ABSTRACT



Data bases of text materials such as English Language
abstracts of documents are difficult to represent in an
information system. Results of numerous investigations indicate
that in many situations different document representations are,
on the average, approximately equally effective. However, recent
research findings indicate that different representations
retrieve different subsets of documents (and relevant documents)
from data bases.



This study investigated document representations in two
different data bases and analyzed the overlap among the
representations (the extent to which the same documents were
retrieved) as well as their performance. Using a technical data
base, seven document representations were investigated. The
study was repeated with a less technical data base using four
representations.

Results indicate major differences between the two data
bases in terms of which representations performed most
effectively within each data base. The overlaps among the
representations were consistently low. Differences were also
found between search intermediaries and between the
representations. Results were also discussed in terms of the
incremental effectiveness of representations, i.e., what is the
cumulative improvement in retrieval performance as
representations are added sequentially?

A probabilistic model of overlap was developed based on the
assumption of random retrieval; the model was fitted against the
obtained asymmetric overlaps and against the incremental
improvements obtained by the different representations. In
general, the model fit these data reasonably well.



ACKNOWLEDGEMENTS 



This study required the help and support of many
individuals and organizations in a variety of ways.
I would like to take this opportunity to publicly
acknowledge their assistance.

The Project Staff carried out the work with good
cheer and quiet efficiency. Though they had their own
responsibilities, they worked as a group and should be
commended as a group: Padmini DasGupta, William Frakes,
Cheryl McAfee, Margaret Montgomery and Judith Tessier.
In addition, several individuals served as consultants
to the Project: Matthew Koll, Terry Noreault and
Robert Waldstein. Many others, not officially on the
Project, were also helpful, especially Robert N. Oddy
and Linda Smith. To all of these people: Thank You!

I also want to thank a few organizations for their
assistance. Both INSPEC and PsychInfo were very helpful
by making portions of their data bases available to the
Project. Information Services and Research was
responsible for obtaining professional intermediaries
to carry out the searches in both Phases of the Project.
Lastly, the School of Information Studies must be
credited for providing an environment where research
of this type is encouraged and supported.



Jeffrey Katzer 
Principal Investigator 




TABLE OF CONTENTS

I. INTRODUCTION

II. OBJECTIVES

III. OVERVIEW

IV. RETRIEVAL ENVIRONMENT

    A. Data Bases
    B. Retrieval System
    C. Search Intermediaries
    D. Users and Queries
    E. Relevance Judgements

V. METHODOLOGY

    A. Variables
    B. Procedures
    C. Design and Analysis

VI. RESULTS

    A. Analysis of Performance
    B. Analysis of Overlaps

VII. DISCUSSION

    A. Data Bases and Indexing
    B. Descriptive Models of Overlap
    C. Theoretical Models of Overlap

REFERENCES



APPENDICES

    A. Training Materials
    B. Instructions to Participants, Relevance Judgements
    C. Directions to Users
    D. Forms for Searcher, Attached to Query
    E. Latin Square Design
    F. AOV Summary Results, Phase I
    G. AOV Summary Results, Phase II
    H. Derivations of Theoretical Models



TABLE OF TABLES

    Overview of Phase I and Phase II
    Characteristics of Users in Phase I
    Characteristics of Users in Phase II
    Document Representations
    Overlaps Among "Best" and "Worst" Performing Representations
    Macro-performance Means and Number of Queries
    Significant Differences in Macro-performance Among Representations
    Micro-performance Means
    Symmetric Pairwise Overlaps - Phase I
    Asymmetric Pairwise Overlaps - Phase I
    Union Pairwise Overlaps - Phase I
    Symmetric Pairwise Overlaps - Phase II
    Asymmetric Pairwise Overlaps - Phase II
    Union Pairwise Overlaps - Phase II
    Representations Ordered by Incremental Improvement - Phase I
    Representations Ordered by Incremental Improvement - Phase I and Phase II
    Maximum and Minimum Contributions of Seven Representations - Phase I
    Maximum and Minimum Contributions of Four Representations - Phase I and Phase II
    Predicted and Obtained Asymmetrical Overlaps
    Predicted and Obtained Incremental Improvements in Recall - Phase I

I. INTRODUCTION

This report presents the results of the Document
Representation Overlap Study. The report contains the research
background and objectives, the procedures used, the findings
obtained, and a discussion of these findings. The study was
designed to contribute to our knowledge of the effect of the
representation of information items on information system
performance.

Past studies have found that when using recall and precision
as performance measures, the differences among various
representations (such as free-text term, or descriptor phrase)
have not been consistently evident. Studies to date have
examined the precision and recall performance of two or more
representations. The results of those studies are equivocal.
For example, Cleverdon (1967), Keen (1973), Salton (1968, pp.
316-349), and McGill (1979) report no sizeable differences among
the representations they examined. On the other hand, the results
from the second Cranfield Project and from studies by Salton
(1973), Sparck-Jones and Jackson (1970), Hersey, et al. (1971),
and Sparck-Jones (1974) reported differences in average
performance levels.



This study takes as its departure evidence that performance
measures have masked real and systematic differences among the
representations. Specifically, different representations result
in the retrieval of different items.



One of the more recent studies supporting this assertion was
conducted by Williams (1977). She computed the overlap among
five different document representations in a random sample of 50
documents taken from Chemical Abstracts. No queries were
obtained from users; rather, representations were compared for
matching terms. The results gave the degree of uniqueness or
lack of overlap among representations. Title, for example, is
claimed to be an important representation for retrieval because
an average of two title terms per document did not appear in
other representations. Smith (1979) provided some indication of
the overlap among seven document representations in a portion of
the INSPEC data base. No users were employed; a random sample
of 35 documents was selected and treated as queries. None of
the average conditional probabilities (measures of asymmetrical
overlap) exceeded .5, meaning that different document
representations tended to retrieve different documents. A third
study (McGill, 1979) compared documents retrieved using free-text
and controlled terms in a portion of the ERIC data base. Users
provided queries which were searched and relevance judgements
obtained. Thirty-three of the queries were selected for a study
of overlap. When each of the intermediaries searched both
document representations, the average overlap was only 14%.
Other queries were searched by intermediaries using different
representations. In this situation, the average overlap dropped
to 5%. Both of these figures are surprisingly low, indicating
that users retrieve quite different sets of documents when the
free and controlled representations are used.



These studies, as well as other investigations of the
effectiveness of combined representations, have somewhat limited
conclusions for three reasons: (1) usually only very few
(usually two) representations were included, (2) often a single,
very small data base was used, and (3) overlap was typically
examined by itself, without any consideration of the
effectiveness of the representations. The study reported here
builds on the previous work, but examines both performance and
overlap of up to seven representations in two different,
moderately sized (12,000 document) data bases.
II. OBJECTIVES



The assessment of the various representations is concerned
with a number of specific objectives:

(1) To determine if the information items retrieved by the
differing representations are significantly and substantially
different.

(2) To assess the effectiveness of representations or
combinations of representations.

(3) To develop and test a theoretic model sufficient to
explain any differences in information retrieval system operation
based on changes in the representation of information items.



III. OVERVIEW 



To achieve these objectives, it was necessary to submit
search requests to alternative representations of a data base and
to design the study so that measures of performance (of each
representation) and overlap (among representations) could be
obtained. The basic study was repeated a second time so that we
could determine if the results were consistent when a different
data base was employed.

The two phases of this investigation correspond to the two
data bases employed. In general, both phases were similar: a
data base was acquired and loaded into the DIATOM retrieval
system. Real users provided written queries which were then
given to trained intermediaries who were instructed to construct
and submit high-recall searches to the system. The
intermediaries were restricted to particular document
representations for a given search, using a balanced design so
that each intermediary used each document representation an equal
number of times. The results of the searches entered for a given
query were merged and given back to the user for relevance
judgements.



Each phase of this study used a different data base. In
addition, the two phases differed in two other important ways:
(1) the analysis design differed, and as a result, (2) the number
of document representations and intermediaries differed. In
Phase I, seven representations were used. Each intermediary used
each representation on one-seventh of the queries. Consequently,
there was a possibility that intermediaries would be confounded
with representations, thereby hampering a clear interpretation of
the results of overlap documents. This possibility was prevented
in Phase II; each intermediary searched each query separately
under all of the representations.

A summary of the characteristics of the two Phases of the
study is presented in Table 1.
Table 1

Overview of Phase I and Phase II

                             Phase I                    Phase II

Duration                     2/80 - 3/81                3/81 - 2/82

Data Base                    INSPEC (Computer &         PsychInfo (Psychological
                             Control Abstracts)         Abstracts)
                             9/79 - 12/79               7/80 - 12/80

Number of Documents          12,000                     12,000

Retrieval System             DIATOM                     DIATOM

Number of Users              69                         45

Number of Queries            84                         52

Number of Intermediaries     7                          4

Number of Representations    7                          4

Type of Design               7x7 Latin Square           4x4 factorial with
                             replicated 12 times        repeated measures
IV. RETRIEVAL ENVIRONMENT

A. Data Bases

For Phase I, permission was granted by the Institution of
Electrical Engineers to use the Computer and Control Abstracts
portion (9/79 - 12/79) of the INSPEC data base. For Phase II,
the PsychInfo User Service granted permission to use a portion of
the 1980 data base (July - December) whose printed counterpart is
Psychological Abstracts. Each data base consisted of
approximately 12,000 documents. The choice of these two data
bases and the number of documents used ensured that sufficient
documents would be retrieved by each document representation.

Each document consisted of a series of bibliographic
citation fields, the abstract, and some indexing information.
The format of each document record as it was printed upon
retrieval is given below.

INSPEC
    DNnumber (abstract numbers from INSPEC journals)
    Title
    Authors (separated by commas)
    Source Field: as follows
        Publication: (volume and issue number)
        (part number) pagination data.
        Following this may be information in ( ).
        This is information on the cover-to-cover
        translation as follows: (publication; (volume
        and issue) pages, (date) (type of unconventional
        media) (availability) (Title of Conference)
        (location of conference) (sponsoring
        organization) (date) language).
    Abstract
    Indexing Information

PsychInfo
    DNnumber (abstract numbers from PsychAbs journals)
    Title
    Authors (separated by semi-colons)
    Source: as follows
        Journal name
        Publication date
        Volume and issue number, pagination.
    Section Code: content classification assigned
        to sections of print PA
    Abstracts: Abstracts (75-175 words) used for
        articles directly relevant to psychology,
        annotations for less central items.
    Indexing Information: Descriptors
                          Identifiers
B. Retrieval System

DIATOM, an on-line retrieval system which was designed to
simulate most of the features of DIALOG, was used to conduct all
the searches in this study. DIATOM was designed and programmed
by Robert Waldstein (1981), a PhD student at the School of
Information Studies.

The major differences between DIATOM and DIALOG are listed
below.

1. DIATOM permitted the searchers to log on directly to a
particular representation. All search statements were
subsequently restricted to that representation only.

2. The system included a stemmer used for the stem-
representation in Phase I.

3. To restrict a search to a particular language, a Limit/ENG
(for English) was used.

4. Adjacency (nW) could not be used with either truncation or
stemming.

5. Adjacency at times ran very slowly; the field operator (F)
could be used instead.



C. Search Intermediaries



All of the intermediaries used in this study were
professional librarians or information brokers with experience
using computerized retrieval systems; all had some experience
using DIALOG.

Before Phase I, the seven intermediaries took part in a
day-long training session. Afterwards, each intermediary was
required to become familiar with DIATOM and the INSPEC data base.
Each intermediary submitted fourteen practice searches. A copy
of the training materials provided to the intermediaries is given
in Appendix A.

Four of the search intermediaries employed in Phase I were
used again in Phase II.* Each intermediary took part in a three
hour training session and was required to submit two practice
searches to the system.

D. Users and Queries

Users were solicited from Syracuse University and other
institutions which were likely to have individuals with
information needs related to the content of the two data bases.
Our objective in accepting users was to come as close as possible
to criteria used in operational search services so that queries
and relevance judgements could be plausibly generalized.

Originally, the study design specified 98 users for Phase I
and 60 for Phase II. Each user was to submit a single query.
However, because of the difficulty in obtaining users, several
users were permitted to submit more than one query. The number
of users, their characteristics, and the number of queries for
each Phase of the study are given in Tables 2 and 3.



E. Relevance Judgements 

Relevance judgements were obtained from the users for all
documents retrieved for the query.** A four-point scale was used
with "1" and "2" indicating relevant, "3" and "4" indicating
non-relevant. The instructions which accompanied the search
results are provided in Appendix B.



*One searcher left the project after completing 42 queries. The
remaining queries were searched by a fifth intermediary who had
the requisite experience and was trained for this study.

**After repeated attempts, four users in Phase I did not return
their relevance judgements. In these few cases we identified
other individuals in the specific topic area of the query who
presumably could make relevance judgements. These surrogate
users made the relevance judgements.



Table 2

Characteristics of Users in Phase I

Affiliation - No. of Users - Faculty - Students - Sci/Eng - Others - No. of Queries

Syracuse U.
General Electric
Univ. of Illinois
Univ. of Louisville
National Bureau of Standards
OCLC Inc.
Environ. Protection Agency
QTISCA Industries
SUNY, College Environ. Sciences & Forestry



Total 



35' 
1 



,6 

5 
6 

1 

1 



69 



26 
0 



0 
0 



28 



8 
0 

^ 

0 



0 
0 



0 

1 

0 ' 



0 
0 



5 



6 

6 



12 



18 



11

Altogether, 69 individuals served as users in this study.
11 of these individuals submitted more than one query:
8 users submitted 2 queries, 2 users submitted 3 queries
and 1 user submitted 4 queries.
Table 3

Characteristics of Users in Phase II

                              # of                                      # of
Affiliation                   Users    Faculty   Students   Others     Queries

Syracuse University           39       11        28         0          44

Utica College                 1        1         0                     1

Madison Community Services    1        0         0          1          1

Social Service Dept QCC       1        1         0          0          3
BMW Cooperative                        0                    1

University of Illinois        1        0         0          1          1

SUNY Albany                   1        0         0          1          1

Total                         45       13        28         4          52

Altogether, 45 individuals served as users in this study. 6 of
these individuals submitted more than 1 query: 5 users submitted
2 queries, and 1 user submitted 3 queries.
V. METHODOLOGY

A. Variables

The key experimental or independent variable was the
representation used in searching the data base. Seven
representations were used in Phase I; four were used in Phase II.
The representations are described in Table 4.

The major dependent or criterion variables were performance
measures (recall and precision) and measures of overlap; the
total number of documents retrieved was also analyzed. These
measures were operationalized as follows.

Recall: The recall ratios were formed by dividing the
number of relevant documents retrieved by each representation by
the total number of relevant documents retrieved by all of the
representations.* Both "macro-" and "micro-" recall ratios were
used (Salton, 1968, p. 299). Macro- (or "user") recall is
computed by taking the average of the recalls calculated for each
query. Micro- (or "system") recall totals the number of
retrieved relevant documents across all queries and then divides
that total by the sum across queries of all relevant documents.

Precision: The precision ratio was formed by dividing the
number of relevant documents retrieved by each representation by
the total number of documents retrieved by that representation.
Both macro- and micro- versions of precision were computed.

Total-Retrieved: This measure is simply the number of
documents retrieved by each representation; it is the
denominator of the precision ratio. It was included because it
is an indication of user effort required to read the output from
the system.
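
The difference between the macro- and micro- versions of recall and precision can be illustrated with a minimal sketch. The per-query counts below are invented for illustration only (they are not data from this study), and the field names `rel_ret`, `ret` and `rel_pool` are hypothetical labels for, respectively, the relevant documents retrieved by one representation, all documents retrieved by that representation, and the relevant documents retrieved by all representations together. Queries with a zero denominator are skipped in the macro averages, which is an assumption about how undefined ratios are handled.

```python
from statistics import mean

def performance(per_query):
    """Compute macro- and micro- recall and precision for one representation.

    per_query is a list of dicts with the hypothetical keys:
      rel_ret  - relevant documents retrieved by this representation
      ret      - all documents retrieved by this representation
      rel_pool - relevant documents retrieved by all representations
    """
    # Macro ("user") measures: average the per-query ratios.
    macro_recall = mean(q["rel_ret"] / q["rel_pool"]
                        for q in per_query if q["rel_pool"] > 0)
    macro_precision = mean(q["rel_ret"] / q["ret"]
                           for q in per_query if q["ret"] > 0)
    # Micro ("system") measures: total the counts first, then divide once.
    total_rel_ret = sum(q["rel_ret"] for q in per_query)
    micro_recall = total_rel_ret / sum(q["rel_pool"] for q in per_query)
    micro_precision = total_rel_ret / sum(q["ret"] for q in per_query)
    return {
        "macro_recall": macro_recall,
        "macro_precision": macro_precision,
        "micro_recall": micro_recall,
        "micro_precision": micro_precision,
    }

# Invented counts for three queries searched under one representation.
example = [
    {"rel_ret": 2, "ret": 4, "rel_pool": 4},
    {"rel_ret": 1, "ret": 10, "rel_pool": 1},
    {"rel_ret": 0, "ret": 0, "rel_pool": 3},
]
result = performance(example)
```

In this sketch the third query contributes a recall of zero to the macro average but is left out of macro-precision because nothing was retrieved for it; the micro measures weight every document equally, so queries with large retrieved sets dominate.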

*During Phase II another research investigation made use of a
stemmed representation (similar to, but not identical with, the
ST representation used in Phase I). Documents retrieved by this
"fifth" representation were also judged for relevance by the
user. The denominator of the recall ratios used in Phase II
includes relevant documents retrieved by the stemmed
representation as well as the four major representations. No
analysis of the stemmed representation for Phase II is included
in this report. It should be noted, however, that the stemmed
representation retrieved relevant documents not retrieved by the
other four representations.
Table 4

Document Representations

Abbreviation   Description                                  Use

DD             Descriptor terms chosen by an indexer;       Phases I & II
               a controlled vocabulary.

AA             Free-text words from the abstract;           Phases I & II
               trivial words excluded.

TT             Free-text words from the title;              Phases I & II
               trivial words excluded.

II             Free-text phrases chosen by an indexer.      Phases I & II

DI             Indexer selected terms. A compound           Phase I
               representation made up of DD and II.

ST             A stemmed version (automatic suffix          Phase I
               removal) of representation TA.

TA             Free-text terms from the title and           Phase I
               abstract. A compound representation
               made up of TT and AA.

Asymmetric-Overlap: For two representations i and j, this
measure is computed by dividing the number of documents retrieved
by both representations by the number retrieved by one of the
representations. If R_i and R_j are the sets of documents
retrieved by representations i and j, then the
asymmetrical-overlap measure can simply be given as

    \[ A_{ij} = \frac{n[R_i \cap R_j]}{n[R_i]} \]

where "n" is the counting operator. Seen this way,
asymmetrical-overlap is the conditional probability of retrieval
using representation j given that the data base is restricted to
those documents retrieved by representation i.

Symmetric-Overlap: For two representations i and j, this
measure is computed by dividing the number of documents retrieved
in common by both representations by the total number of
different documents retrieved by either. Or more formally, it is
the number of retrieved documents in the intersection of the two
representations divided by the number retrieved by the union of
those representations.

    \[ S_{ij} = \frac{n[R_i \cap R_j]}{n[R_i \cup R_j]} \]



Union-Overlap: For two representations i and j, this
measure is computed by dividing the number of documents retrieved
by either of the representations by the number of documents
retrieved by all r representations.

    \[ U_{ij} = \frac{n[R_i \cup R_j]}{n[R_1 \cup R_2 \cup \cdots \cup R_r]} \]

Thus, the union-overlap is more of a recall ratio for a
combination of representations. It can be extended to
combinations of more than two representations by expanding the
numerator.
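
The three overlap measures translate directly into set operations. The following is a minimal sketch: the function names and the small document-number sets are hypothetical and chosen only to make the arithmetic easy to follow, and returning `None` for an empty denominator is an assumption, since the text does not say how that case was treated.

```python
def asymmetric_overlap(r_i, r_j):
    """A_ij: share of the documents retrieved by i that j also retrieved."""
    if not r_i:
        return None  # undefined when representation i retrieved nothing
    return len(r_i & r_j) / len(r_i)

def symmetric_overlap(r_i, r_j):
    """S_ij: documents retrieved by both, over documents retrieved by either."""
    either = r_i | r_j
    if not either:
        return None
    return len(r_i & r_j) / len(either)

def union_overlap(r_i, r_j, all_sets):
    """U_ij: documents retrieved by i or j, over documents retrieved by any
    of the r representations."""
    pooled = set().union(*all_sets)
    if not pooled:
        return None
    return len(r_i | r_j) / len(pooled)

# Invented retrieved sets (document numbers) for three representations.
dd = {1, 2, 3, 4}
tt = {3, 4, 5}
aa = {4, 6, 7, 8}

a_dd_tt = asymmetric_overlap(dd, tt)       # 2 shared / 4 retrieved by DD
a_tt_dd = asymmetric_overlap(tt, dd)       # 2 shared / 3 retrieved by TT
s_dd_tt = symmetric_overlap(dd, tt)        # 2 shared / 5 retrieved by either
u_dd_tt = union_overlap(dd, tt, [dd, tt, aa])  # 5 by either / 8 by any
```

The example shows why the first measure is called asymmetric: the same pair of sets gives a different value depending on which representation supplies the denominator.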

Different versions of these dependent variables were
computed; they differed in terms of the stringency of the
relevance criterion. In both Phases of this investigation
relevance was determined by the requestor. A four-point
continuum was used, from 1 (definitely relevant) to 4 (definitely
not relevant). Some analyses are based on a "strict" definition
of relevance: only those documents judged "1" were included. Other
analyses used a dichotomized relevance judgement and a broader
definition of relevance: those documents judged
"1" or "2" were acceptable. Lastly, some analyses are based on
all retrieved documents; relevance was not taken into account.

These alternative versions of the dependent variables are
identified by an appended suffix. For example, Recall-1,
Precision-1, Overlap-1, etc. are all based on the stricter
definition of relevance; those measures with a suffix of 2 are
based on the broader definition.
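
As a small illustration of how the suffixed versions differ, the sketch below derives the document set used by each version from a user's four-point judgements. The function name, the `version` argument and the sample judgements are hypothetical; only the thresholds ("1" alone for the strict version, "1" or "2" for the broader version, and no relevance filter for the all-retrieved version) come from the description above.

```python
def documents_for_version(judgements, version=None):
    """Select the documents that count for one version of a measure.

    judgements maps a document number to the user's score (1 to 4).
    version=1 keeps only documents judged 1 (strict definition),
    version=2 keeps documents judged 1 or 2 (broader definition),
    version=None keeps every retrieved document (relevance ignored).
    """
    if version is None:
        return set(judgements)
    if version not in (1, 2):
        raise ValueError("version must be 1, 2 or None")
    return {doc for doc, score in judgements.items() if score <= version}

# Invented judgements for five retrieved documents.
sample = {101: 1, 102: 2, 103: 3, 104: 4, 105: 1}
strict = documents_for_version(sample, 1)
broad = documents_for_version(sample, 2)
everything = documents_for_version(sample)
```

Any of the measures defined earlier can then be computed on the selected sets, which is what distinguishes, for example, Recall-1 from Recall-2.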

B. Procedure

Queries obtained from users (see Appendix C for Directions
to Users) were used as submitted; they were not screened for
appropriateness to the data base or for on-line searching in
Phase I; some screening was used in Phase II. Each intermediary
was given a photocopy of the search request. In Phase I, each
intermediary used a different representation to search each
query, and across all the queries each intermediary used each
representation an equal number of times. In Phase II, each
intermediary searched each query four times using all four
representations. In both phases, computer programs within the
DIATOM system controlled the order in which representations were
used: according to the Latin Square Design in Phase I and
randomly in Phase II (see Appendix E).

Search intermediaries used the DIATOM system to retrieve
documents. Intermediaries were instructed to carry out
"high-recall" searches. The directions given to each
intermediary are provided in Appendix D.

After a query was completely searched (seven times in Phase
I, sixteen times in Phase II), the retrieved document sets were
merged into a single listing and placed in reverse
order. This listing consisted of the citations and abstracts of
the retrieved documents (if more than 200 documents were
retrieved, a random sample of 200 was used). No clue was present
which indicated either the intermediary or the representation
used to retrieve the document.
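
The merging step can be sketched as follows. This is an illustration under stated assumptions, not the program used in the project: the function name and the `limit` and `seed` parameters are hypothetical, and ordering the listing by descending document number is only one possible reading of the ordering described above.

```python
import random

def merge_for_judging(result_sets, limit=200, seed=None):
    """Pool the documents retrieved by every search of one query.

    Duplicates are removed so that the listing gives no clue about how
    many searches, or which ones, retrieved a document. When more than
    `limit` distinct documents were retrieved, a random sample of
    `limit` documents is drawn instead.
    """
    merged = set()
    for docs in result_sets:
        merged.update(docs)
    listing = sorted(merged)
    if len(listing) > limit:
        listing = random.Random(seed).sample(listing, limit)
    # Assumed ordering: highest document number first.
    return sorted(listing, reverse=True)

# Two invented searches with 300 distinct documents between them.
large = merge_for_judging([range(0, 150), range(100, 300)], seed=1)
# Two invented searches small enough to be listed in full.
small = merge_for_judging([{5, 9, 12}, {9, 40}])
```

Because the sample is drawn from the pooled set rather than from each search separately, a document retrieved under several representations has the same chance of being judged as one retrieved only once.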





Two copies of this listing were produced. Both copies were
sent to the user with instructions (see Appendix B) to make
relevance judgements on one copy, which was to be returned to the
project; the second copy was for the user.

C. Design and Analysis

The measures of macro-recall, macro-precision and
total-retrieved were analyzed using standard analysis of variance
(AOV) computations. The design and the analysis can control for
extraneous variables and can identify separate effects for the
representations, intermediaries, and other components of the
study, including interaction effects if desired.

In Phase I, the overall design can be characterized as a 7x7
Latin Square replicated 12 times (hence 84 queries). The Latin
Squares used in this study are given in Appendix E. The
partitioning of the total variation can be determined from the
various AOV Summary Tables given in Appendix F.
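
The balance property of the Phase I design can be demonstrated with a short sketch. The squares actually used are those in Appendix E; the cyclic construction below is a stand-in chosen for brevity, and the intermediary labels S1 to S7 and the function names are hypothetical. Rows of the square correspond to queries, columns to intermediaries, and cell entries to representations.

```python
from collections import Counter

def cyclic_latin_square(n):
    """Return an n x n Latin square; row r is 0..n-1 rotated by r places."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]

def assign_searches(queries, intermediaries, representations):
    """Give every query one search per intermediary, each under a
    different representation, reusing the square for each block of
    n queries (one block is one replication)."""
    n = len(representations)
    if len(intermediaries) != n:
        raise ValueError("need as many intermediaries as representations")
    square = cyclic_latin_square(n)
    plan = {}
    for index, query in enumerate(queries):
        row = square[index % n]
        plan[query] = {intermediaries[col]: representations[row[col]]
                       for col in range(n)}
    return plan

REPRESENTATIONS = ["DD", "AA", "TT", "II", "DI", "ST", "TA"]
INTERMEDIARIES = [f"S{k}" for k in range(1, 8)]
plan = assign_searches(list(range(1, 85)), INTERMEDIARIES, REPRESENTATIONS)

# Count how often each intermediary used each representation.
usage = Counter((who, rep) for row in plan.values() for who, rep in row.items())
```

With 84 queries every query is searched once under each of the seven representations, and every intermediary uses every representation exactly 12 times, that is, on one-seventh of the queries.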

Approximately ten percent (66) of the precision results had
to be excluded from the analysis because no documents were
retrieved for a given query under a given representation.
Fourteen queries had to be excluded from all Recall-1 analysis,
and seven from the Recall-2 analysis, because in each situation
no relevant documents were retrieved.

In Phase II, the overall design can be described as a
factorial design containing sixteen cells (four searchers by four
representations). Each of 57 queries was searched under all
sixteen combinations. This design, in contrast with the Latin
Square design used in Phase I, required that each intermediary
use all representations when searching a query, thereby
enabling us to determine if representation effects interacted
with intermediary effects.
VI. RESULTS



Our initial concern was to determine if the results from
this study repeated the pattern noted earlier: relatively little
difference in performance among the representations coupled with
relatively little overlap. Table 5 presents these results. It
is apparent that these results do repeat the pattern observed in
other studies. Though some performance measures are
significantly different, none of the differences exceed .18,
which is clearly within the range of values reported in the
literature. The overlaps range from a low of about 14% to a high
of about 27%; these also correspond to the earlier results.

The remaining part of this section presents these findings
in more detail. First the performance measures will be
considered. Then the study of overlaps will be presented.



A. Analysis of Performance 



The macro-performance measures of recall, precision, and
total-retrieved are analyzed in terms of document
representations. The design of the two studies also analyzes
macro-performance in terms of search intermediary differences and
(in Phase II) an interaction between searchers and
representations. If interaction effects existed, the
discussion of document representations would have to be tempered
by their relationship with intermediary effects. Fortunately,
that did not turn out to be necessary: the Phase II results
(Appendix G) indicate an absence of searcher/representation
interaction. Furthermore, the results show that searcher effects
did not consistently appear: they were sizeable in Phase I and
much smaller in Phase II (Appendix F and G).



Descriptive summary statistics for the macro-performance
measures are presented in Tables 6 and 7. The macro-performance
means were tested for statistically significant differences
(see Appendix F and G for the AOV Summary Tables). A listing of
the significant differences can be found in Table 7. It must be
stated at the outset that there are some major differences in the
results of the two Phases and consequently they need to be
discussed separately.
Table 5

Overlaps Among "Best" and "Worst" Performing Representations

                          "Best"        "Worst"
                          Performing    Performing                  Symmetric
                          Represent.*   Represent.*   Difference    Overlap***

Phase I    Recall-1       .404          .229          .175**        .155
           Recall-2       .321          .200          .121**        .138
           Precision-1    .264          .173          .091          .172
           Precision-2    .422          .336          .086          .150

Phase II   Recall-1       .263          .179          .084**        .264
           Recall-2       .242          .153          .089**        .234
           Precision-1    .282          .219          .063          .273
           Precision-2    .539          .416          .123**        .256

  *Macro-performance measures are taken from Table 6.
 **Difference statistically significant at .05 level.
***Symmetric-overlap figures are taken from Tables 9 and 12
   using the pairwise overlap between the "best" and "worst"
   performing representation.






Table 6

Macro-performance Means and Number of Queries

                          DD       AA       TT       II       DI       ST       TA

Phase I    Recall-1       .229     .365     .273     .339     .330     .392     .404
                          (70)     (70)     (70)     (70)     (70)     (70)     (70)

           Recall-2       .200     .270     .205     .321     .284     .317     .290
                          (77)     (77)     (77)     (77)     (77)     (77)     (77)

           Precision-1    .173     .197     .264     .218     .221     .188     .224
                          (62)     (77)     (70)     (79)     (75)     (81)     (78)

           Precision-2    .336     .352     .422     .403     .361     .338     .352
                          (62)     (77)     (70)     (79)     (75)

           Total-Retr.    13.2     17.5     12.4     16.1     16.4     19.8     18.6
                          (84)     (84)     (84)     (84)

Phase II   Recall-1       .263     .256     .179     .205
                          (176)    (177)    (177)    (179)

           Recall-2       .242     .213     .153     .182
                          (176)    (177)    (177)    (179)

           Precision-1    .282     .219     .276     .255
                          (176)    (177)    (177)    (179)

           Precision-2    .532     .416     .539     .500
                          (176)    (177)    (177)    (179)

           Total-Retr.    18.6     17.9     10.3     12.6
                          (176)    (177)    (177)    (179)
Table 7

Significant Differences in Macro-performance Among Representations

                            Representation       Average        Percent
                            Poorer    Better     Difference*    Improvement

Phase I    Recall-1           DD        TA          .175           76%
                              DD        ST          .173           71%
                              DD        AA          .136           59%

           Recall-2           DD        II          .121           60%
                              DD        ST          .117           58%
                              TT        II          .116           56%
                              TT        ST          .112           55%

           Precision-1

           Precision-2

Phase II   Recall-1           TT        DD          .084           47%
                              TT        AA          .077           43%

           Recall-2           TT        DD          .089           58%
                              TT        AA          .060           39%
                              II        DD          .060           33%

           Precision-1

           Precision-2        AA        TT          .123           30%
                              AA        DD          .116           28%

*Differences are significant at .05 level using Tukey's HSD
procedure. See Appendix F and G for details.
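
The percent-improvement column appears to be the average
difference expressed relative to the mean of the poorer
representation. The short sketch below assumes that reading; the
function name is illustrative and is not taken from the report.
The inputs are macro-performance means from Table 6.

```python
def percent_improvement(poorer_mean, better_mean):
    """Difference between two means as a percentage of the poorer mean.

    Assumed reading of the 'Percent Improvement' column in Table 7.
    """
    return 100.0 * (better_mean - poorer_mean) / poorer_mean


# Phase I, Recall-1: DD (.229) versus TA (.404)
phase1_recall1 = round(percent_improvement(0.229, 0.404))

# Phase II, Precision-2: AA (.416) versus TT (.539)
phase2_precision2 = round(percent_improvement(0.416, 0.539))

print(phase1_recall1, phase2_precision2)
```

Under this assumption the two computed values agree with the
corresponding Table 7 entries after rounding to whole percents.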






For Phase I results, representations differed significantly
in macro Recall-1, Recall-2, and Total-Retrieved scores. As
indicated in Table 7, descriptors (DD) and titles (TT) performed
rather poorly as representations on the recall measures, while
identifiers (II) and title-abstracts (either TA or ST) performed
much better.



Even though no pairs of representations differed
significantly in either precision measure, it is useful to
include some consideration of precision in these findings.
Considering all five measures, the descriptor (DD) representation
performs uniformly poorly on the recall and precision measures
while title-abstract (TA) performs reasonably well on them --
though its positive showing is not as pronounced as DD's
negative performance. Interestingly, the free-text words assigned
by indexers (II) perform moderately well over all five measures.
Stemming (ST), which would tend to increase the total number
retrieved, performs quite well on the recall measures but poorly
on the precision measures. The title representation (TT) shows
the opposite pattern -- high on the precision measures (and
Tot-Ret.) and low for recall. The other representations fluctuate
quite a bit over the five measures.

For Phase II the patterns of results are for the most part
different. One important exception is titles (TT), which perform
poorly here in terms of recall, as in Phase I. The major
difference between the two phases has to do with the relative
performance of descriptors (DD) and free-index phrases (II). In
Phase I, the index phrases perform much better than the
descriptors, while in Phase II the results are somewhat
reversed. And, somewhat surprisingly, this pattern occurs in
terms of precision as well as recall. The precise cause of this
reversal cannot be ascertained experimentally from the data
collected in this study. Two possibilities should be considered:
(1) the differences that exist between the two data bases,
especially in terms of specificity of terms, and (2) the
differences that exist between the directions and training given
the indexers at INSPEC and at PsycInfo.



Data base differences, however, are not likely to be the
major cause of Phase II producing generally lower values in
macro-recall and higher values in macro-precision than the
comparable results in Phase I. Instead, these general trends in
macro-performance between the two Phases are probably related to
differences in the design of the two studies. In both Phase I
and Phase II, the numerator of the macro-recalls was based on the
results of one intermediary searching the data base once. The
two phases differed, however, in the denominators: in Phase I it
was based on seven intermediaries searching the query once, while
in Phase II the denominator was based on 16 searches (four
intermediaries each using all four representations). Therefore,
there was more opportunity to identify relevant documents for the
recall denominator in Phase II, leading to a lower average
macro-recall. The macro-precision figures could easily have been
affected by searching time. In Phase II each query had to be
searched by an intermediary four times. Intermediaries may have
reduced the search time so that the total time allotted to each
query was comparable to the time spent in Phase I searches. To
the extent that relevant documents are more likely to be
retrieved early in the search process, the higher levels
of macro-precision found in Phase II can be attributed in part
to decreased search times.

For both of these reasons, the differences between the two
Phases in terms of macro-performance should not be attributed to
the differences in the two data bases. The fact that the
micro-performance results discussed below do not present a
similar pattern between the two Phases strengthens this position.

The average micro-performance levels are reported in Table
8.* Micro-performance addresses the issue of how well the
representations can do when multiple searchers pool their
results. It is a more conservative approach; as indicators of
system-level performance, micro-measures are very helpful because
they decrease the effect of single (perhaps atypical) searches or
queries. In general, the results noted in the macro-performance
data are also evident here. For Phase I, the index phrases (II)
perform quite well overall, while the descriptors (DD) do poorly;
the reverse is true for Phase II. For Phase II the micro-recall
figures are higher than those of Phase I. This finding is much
more intuitively reasonable than the macro-recall data suggest,
given the nature of the topics contained in the two data bases.
This, plus the possible artifacts due to design (noted above),
makes the micro-recall figures for Phase II better indicators of
the recall obtained in that study.

*Because statistical inferential tests were not calculated on any
of the micro-performance measures, it is not known if the
observed differences are larger than what could be expected to
occur by chance.
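
The contrast between macro- and micro-measures can be illustrated
with a small sketch. It assumes the conventional definitions
(macro: the mean of the per-search ratios; micro: pooled counts
divided by pooled totals); the exact computation used in this
study may differ, and the counts below are invented purely for
illustration.

```python
def macro_recall(searches):
    """Mean of per-search recall ratios.

    `searches` is a list of (relevant_retrieved, relevant_total) pairs.
    """
    ratios = [hits / total for hits, total in searches if total > 0]
    return sum(ratios) / len(ratios)


def micro_recall(searches):
    """Pooled recall: all relevant retrieved over all relevant available."""
    pooled_hits = sum(hits for hits, _ in searches)
    pooled_total = sum(total for _, total in searches)
    return pooled_hits / pooled_total


# Hypothetical counts: one small, atypical search with a high ratio
# and two larger searches with lower ratios.
example_searches = [(1, 2), (3, 20), (4, 10)]

print(macro_recall(example_searches))
print(micro_recall(example_searches))
```

In this invented example the single small search pulls the macro
figure upward, while the pooled (micro) figure is dominated by the
larger searches, which is the sense in which micro-measures damp
the effect of atypical searches.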
Table 8

Micro-performance Means

                         DD      AA      TT      II      DI      ST      TA

Phase I
  Recall-1              .237    .328    .285    .348    .309    .304    .369
  Recall-2              .216    .283    .229    .306    .268    .281    .294
  Precision-1           .173    .181    .221    .208    .182    .148    .192
  Precision-2           .335    .332    .378    .389    .336    .291    .324

Phase II
  Recall-1              .520    .475    .322    .351
  Recall-2              .526    .440    .313    .350
  Precision-1           .133    .120    .141    .122
  Precision-2           .340    .283    .347    .309

B. Analysis of Overlaps

The simplest analysis of overlaps is pairwise, comparing
each representation with every other representation. Tables 9-11
report the overlaps for Phase I data; Tables 12-14 for Phase II.
Each table contains three overlap analyses: (1) most relevant
documents, (2) all relevant documents, and (3) all documents
retrieved. In these tables, a high value indicates greater
overlap and therefore less of an independent contribution of the
"second" representation.

In both Phases, the pairwise overlaps decrease as the number
of documents under consideration increases. That is, the average
overlap is highest when only most relevant documents are
considered; it is lowest when all retrieved documents are
included. A second general finding is that the overlap figures
are lowest when overlap is defined symmetrically; they are the
highest for the union overlap. This, of course, is a function of
the definition of the three measures of overlap. There is also a
difference between the results of the two Phases. The average
overlaps in Phase I are consistently lower than the corresponding
averages for Phase II. At least part of this difference between
the Phases is due to the different designs used. In Phase II,
the design should have had a systematic effect of raising the
overlaps, first by excluding a searcher-representation
interaction, and second by using the same intermediaries (with
their individual understanding of the queries) to search each
query on all four representations.
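
The three overlap measures can be sketched as set operations. The
definitions below are assumptions inferred from the way the
measures are described in this report (symmetric: shared documents
over the union of the two retrieved sets; asymmetric: shared
documents over the set retrieved by one of the representations;
union: distinct documents retrieved by either representation over
the pooled set for the version in question). Function names and
sample document numbers are hypothetical.

```python
def symmetric_overlap(a, b):
    """Shared documents as a fraction of all documents either one retrieved."""
    return len(a & b) / len(a | b)


def asymmetric_overlap(a, b):
    """Fraction of b's documents that a also retrieved (b is the denominator)."""
    return len(a & b) / len(b)


def union_overlap(a, b, pool):
    """Fraction of the pooled documents retrieved by a or b (a recall estimate)."""
    return len((a | b) & pool) / len(pool)


# Hypothetical retrieved sets for two representations and a pooled set.
rep_a = {1, 2, 3, 4}
rep_b = {3, 4, 5}
pool = set(range(1, 11))

print(symmetric_overlap(rep_a, rep_b))
print(asymmetric_overlap(rep_a, rep_b), asymmetric_overlap(rep_b, rep_a))
print(union_overlap(rep_a, rep_b, pool))
```

With these definitions the asymmetric measure changes with the
choice of denominator, and the union measure reduces to a single
representation's recall when the two sets are the same.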

The major finding in these data is that the overlaps are
quite small, as indicated by the averages. For example, the
highest symmetric overlap among the relevant documents is only
about one-third -- .313 between ST and AA in Phase I, and .363
between AA and II in Phase II.

The low overlap between index phrases and either titles or
abstract terms can in part be attributed to the fact that
indexers may have selected the II phrases from the body of the
document, not from the title or abstract. But, in general, there
is not any single or simple procedural explanation for these
findings. Overlaps were even low between representations that
should have retrieved very similar documents. This can be seen
most clearly in the Phase I results by comparing the simple and
the compound representations such as abstract (AA) and
title-abstract (TA) or descriptor (DD) and descriptor-identifier
(DI). One possible explanation for the small overlaps is
searcher differences, which is the only possible explanation for
low overlaps between simple and compound representations. But,
as an explanation for the low overlaps among all representations,
searcher differences are not likely to be the major cause even
though the analysis of variance tables (see Appendix F and G)
show that searcher effects occasionally account for significant
portions of the variance. It is the data in the ranking study
(McGill, 1979) that cast doubt on the contention that searchers
are the sole or major cause of the low amount of overlap. In the
ranking study, overlaps between different representations
searched by the same searcher only equalled 14% for retrieved
documents. That figure certainly falls in the range of values
reported here. Furthermore, the Phase II design required that
each intermediary search each query under all representations;
the overlap results were, at best, moderate.

In the symmetric measures (Tables 9 and 12) there is
considerable consistency across representations -- especially
when the inflating effect of the three compound representations
in Phase I is excluded. In both Phases the maximum difference
in overlaps does not exceed 0.10. Also, the free-index phrases
(II) in both Phases show a tendency to share more relevant
documents with title and abstract fields than with the descriptor
field -- although the size of this overlap is still quite small.

The asymmetric measures indicate the proportion of documents
that would have been retrieved "anyway" -- that is, by the other
representation. For example, Table 13 reports an asymmetric
overlap of .378 between DD and II for the most relevant
documents. This can be interpreted as follows: of all the
documents retrieved by the descriptor representation,
approximately 38 percent of them can also be retrieved by the
free-index phrases. Tables 10 and 13 provide both row and column
average figures (the other tables are symmetrical and a single
set of averages suffices). A useful interpretation of the
difference between row and column averages for a single
representation can be given in terms of the sequence in which the
representations are used in searching. The averages of the
columns of numbers (presented along the bottom of the table) can
be interpreted in terms of being used "first" in the search
process. Given a single representation (indicated by the column
heading), the average at the bottom indicates the proportion of
documents retrieved by this representation that could also be
retrieved by other representations. The averages presented in
the right column are understandable in terms of being used last
in the search process. Given retrieved documents from other
representations, the row average for a given representation
indicates its effect if searching were resumed using it alone --
the lower the average, the more the new representation will
contribute.
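
The row and column averages can be reproduced with a few lines of
code. The sketch below uses the most-relevant asymmetric overlaps
for Phase II as read from Table 13 (columns form the denominator)
and omits the diagonal, as the table notes specify. The function
name is illustrative.

```python
def row_and_column_averages(labels, matrix):
    """Average each row and each column of a square matrix, skipping the diagonal."""
    n = len(labels)
    row_avg = {}
    col_avg = {}
    for i, name in enumerate(labels):
        row_avg[name] = sum(matrix[i][j] for j in range(n) if j != i) / (n - 1)
        col_avg[name] = sum(matrix[j][i] for j in range(n) if j != i) / (n - 1)
    return row_avg, col_avg


labels = ["II", "DD", "AA", "TT"]
# Rows and columns in the order of `labels`; values as read from Table 13.
most_relevant = [
    [1.000, 0.378, 0.469, 0.551],
    [0.552, 1.000, 0.452, 0.551],
    [0.616, 0.407, 1.000, 0.536],
    [0.491, 0.336, 0.364, 1.000],
]

row_avg, col_avg = row_and_column_averages(labels, most_relevant)
for name in labels:
    print(name, round(row_avg[name], 3), round(col_avg[name], 3))

# Representation whose documents are least often retrieved by the others.
print(min(col_avg, key=col_avg.get))
```

The smallest column average identifies the representation whose
retrieved documents are least duplicated by the others, which is
the "first use" reading described above.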

Given this distinction between using (or implementing) a
representation "first" or "last", the asymmetric overlaps (in
Tables 10 and 13) present a rather consistent picture --
especially for the most relevant documents. In Phase I, either
descriptors or free-index phrases are slightly the best choice
for "first" use; in Phase II it is clearly the descriptors. For
"last" use, the data indicate titles in Phase I and descriptors
again in Phase II. The distinction between first and last use of
a representation will be important in the next section of this
report.

Union overlaps presented in Tables 11 and 14 give an
estimate of the combined effect of two representations; they are
conceptually equivalent to the recall ratio for the two
representations. Because the numerator of these pairwise union
overlaps includes all distinct documents (in the appropriate
version) retrieved by two representations, the union overlaps
will have higher values than comparable figures for the
symmetrical and asymmetrical overlaps. In principle, the
diagonal elements in the union overlaps should be identical to
micro-recall values presented in Table 8. And, that is true for
Phase I data. However, as noted earlier in this report, Phase II
micro-recalls were based on five representations (the fifth
one was produced for another research investigation) while the
overlaps in Table 14 are based on retrievals from four
representations; hence the discrepancy.
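
If the union overlaps are read as recall ratios over a common
pool, the share of the pool retrieved by both representations
follows by inclusion-exclusion, and the symmetric and asymmetric
overlaps can be recovered from it. The sketch below assumes that
reading and uses the Phase II most-relevant figures for II and DD
from Table 14 (diagonals .377 and .550, off-diagonal .719); the
function name is illustrative.

```python
def derive_overlaps(recall_a, recall_b, union_ab):
    """Recover pairwise overlaps from two single recalls and their union recall.

    Assumes all three ratios share the same denominator (the pooled set).
    """
    shared = recall_a + recall_b - union_ab  # inclusion-exclusion
    return {
        "shared": shared,
        "symmetric": shared / union_ab,      # shared over union
        "asym_b_denominator": shared / recall_b,
        "asym_a_denominator": shared / recall_a,
    }


# a = II, b = DD (Phase II, most relevant documents, Table 14).
ii_dd = derive_overlaps(0.377, 0.550, 0.719)
for key, value in ii_dd.items():
    print(key, round(value, 3))
```

Under this assumption the derived values round to the
corresponding II/DD entries in Tables 12 and 13 (.289 symmetric;
.378 and .552 asymmetric), which suggests the three measures are
mutually consistent views of the same retrieved sets.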

The union overlap results from Phase I show that most pairs
of representations achieve at least 50 percent recall levels, but
not much higher. In contrast, the Phase II figures are higher:
all pairs of representations (off-diagonals) provide over 50
percent recall, and the combination of descriptors and abstracts
gives over 80 percent of the most relevant documents and over 75
percent of all documents retrieved.

Union overlaps are one way to explore "marginal utility" or
the "value added" of additional representations. Tables 11 and
14 provide only pairwise overlaps. The extension to more than
two representations is necessary in order to get overall
conclusions. The next section of this report takes this
approach.

Table 9

Symmetric Pairwise Overlaps - Phase I

          AA      TT      TA      ST      II      DI      DD     AVG*

Version - Most Relevant

AA      1.000   0.181   0.270   0.313   0.212   0.217   0.125   .220
TT      0.181   1.000   0.227   0.178   0.236   0.209   0.172   .200
TA      0.270   0.227   1.000   0.307   0.208   0.236   0.155   .234
ST      0.313   0.178   0.307   1.000   0.179   0.201   0.115   .215
II      0.212   0.236   0.208   0.179   1.000   0.314   0.173   .220
DI      0.217   0.209   0.236   0.201   0.314   1.000   0.270   .241
DD      0.125   0.172   0.155   0.115   0.173   0.270   1.000   .168

Version - All Relevant

AA      1.000   0.141   0.215   0.235   0.167   0.186   0.112   .176
TT      0.141   1.000   0.154   0.133   0.173   0.172   0.150   .154
TA      0.215   0.154   1.000   0.245   0.167   0.173   0.114   .178
ST      0.235   0.133   0.245   1.000   0.138   0.137   0.081   .161
II      0.167   0.173   0.167   0.138   1.000   0.242   0.138   .171
DI      0.186   0.172   0.173   0.137   0.242   1.000   0.258   .195
DD      0.112   0.150   0.114   0.081   0.138   0.258   1.000   .142

Version - All Documents

AA      1.000   0.064   0.148   0.138   0.112   0.103   0.046   .102
TT      0.064   1.000   0.072   0.057   0.086   0.080   0.068   .071
TA      0.148   0.072   1.000   0.156   0.096   0.092   0.052   .103
ST      0.138   0.057   0.156   1.000   0.077   0.063   0.033   .087
II      0.112   0.086   0.096   0.077   1.000   0.131   0.063   .094
DI      0.103   0.080   0.092   0.063   0.131   1.000   0.120   .098
DD      0.046   0.068   0.052   0.033   0.063   0.120   1.000   .064

* Averages were computed with the diagonal element omitted.

Table 10

Asymmetric Pairwise Overlaps** - Phase I

          AA      TT      TA      ST      II      DI      DD     AVG*

Version - Most Relevant

AA      1.000   0.329   0.401   0.496   0.340   0.368   0.266   0.367
TT      0.286   1.000   0.328   0.293   0.348   0.332   0.323   0.318
TA      0.451   0.424   1.000   0.520   0.355   0.420   0.344   0.419
ST      0.459   0.312   0.428   1.000   0.284   0.332   0.234   0.341
II      0.361   0.424   0.334   0.325   1.000   0.508   0.365   0.386
DI      0.346   0.359   0.351   0.337   0.450   1.000   0.490   0.383
DD      0.192   0.268   0.221   0.183   0.248   0.376   1.000   0.248
AVG*    0.349   0.353   0.344   0.359   0.338   0.389   0.337

Version - All Relevant

AA      1.000   0.276   0.348   0.381   0.275   0.323   0.233   0.306
TT      0.223   1.000   0.237   0.212   0.258   0.274   0.268   0.245
TA      0.361   0.304   1.000   0.402   0.281   0.310   0.241   0.31?
ST      0.379   0.261   0.385   1.000   0.233   0.247   0.172   0.279
II      0.297   0.344   0.292   0.254   1.000   0.418   0.292   0.316
DI      0.305   0.319   0.283   0.235   0.366   1.000   0.458   0.328
DD      0.178   0.253   0.178   0.132   0.207   0.370   1.000   0.220
AVG*    0.291   0.293   0.287   0.269   0.270   0.324   0.277

Version - All Documents

AA      1.000   0.145   0.250   0.229   0.210   0.193   0.103   0.188
TT      0.103   1.000   0.113   0.088   0.140   0.131   0.123   0.116
TA      0.265   0.169   1.000   0.262   0.188   0.180   0.119   0.197
ST      0.259   0.141   0.279   1.000   0.159   0.131   0.080   0.175
II      0.193   0.182   0.163   0.129   1.000   0.230   0.131   0.171
DI      0.180   0.172   0.158   0.108   0.233   1.000   0.240   0.182
DD      0.078   0.131   0.085   0.053   0.108   0.194   1.000   0.108
AVG*    0.180   0.157   0.175   0.145   0.173   0.177   0.133

* Averages were computed with the diagonal element omitted.

** The representations in the columns form the denominator of
the overlap measure.

Table 11

Union Pairwise Overlaps - Phase I

          AA      TT      TA      ST      II      DI      DD     AVG*

Version - Most Relevant

AA      0.328   0.520   0.549   0.481   0.558   0.523   0.502   .522
TT      0.520   0.285   0.533   0.500   0.512   0.491   0.446   .500
TA      0.549   0.533   0.369   0.515   0.594   0.548   0.525   .544
ST      0.481   0.500   0.515   0.304   0.553   0.510   0.485   .507
II      0.558   0.512   0.594   0.553   0.348   0.500   0.499   .536
DI      0.523   0.491   0.548   0.510   0.500   0.309   0.430   .500
DD      0.502   0.446   0.525   0.485   0.499   0.430   0.237   .481

Version - All Relevant

AA      0.283   0.449   0.475   0.457   0.505   0.465   0.449   .467
TT      0.449   0.229   0.453   0.451   0.456   0.424   0.388   .437
TA      0.475   0.453   0.294   0.462   0.514   0.479   0.458   .474
ST      0.457   0.451   0.462   0.281   0.516   0.483   0.461   .472
II      0.505   0.456   0.514   0.516   0.306   0.462   0.459   .485
DI      0.465   0.424   0.479   0.483   0.462   0.268   0.385   .450
DD      0.449   0.388   0.458   0.461   0.459   0.385   0.216   .433

Version - All Documents

AA      0.220   0.353   0.395   0.412   0.380   0.386   0.369   .382
TT      0.353   0.156   0.363   0.384   0.331   0.335   0.302   .345
TA      0.395   0.363   0.234   0.418   0.398   0.402   0.380   .393
ST      0.412   0.384   0.418   0.249   0.420   0.428   0.402   .411
II      0.380   0.331   0.398   0.420   0.203   0.361   0.347   .373
DI      0.386   0.335   0.402   0.428   0.361   0.206   0.332   .374
DD      0.369   0.302   0.380   0.402   0.347   0.332   0.166   .355

* Averages were computed with the diagonal element omitted.

Table 12

Symmetric Pairwise Overlaps - Phase II

          II      DD      AA      TT     AVG*

Version - Most Relevant

II      1.000   0.289   0.363   0.351   0.334
DD      0.289   1.000   0.273   0.264   0.275
AA      0.363   0.273   1.000   0.277   0.304
TT      0.351   0.264   0.277   1.000   0.297

Version - All Relevant

II      1.000   0.269   0.319   0.328   0.305
DD      0.269   1.000   0.233   0.234   0.245
AA      0.319   0.233   1.000   0.256   0.269
TT      0.328   0.234   0.256   1.000   0.273

Version - All Documents

II      1.000   0.199   0.182   0.215   0.199
DD      0.199   1.000   0.150   0.159   0.169
AA      0.182   0.150   1.000   0.127   0.153
TT      0.215   0.159   0.127   1.000   0.167

* Averages were computed with the diagonal element omitted.

Table 13

Asymmetric Pairwise Overlaps** - Phase II

          II      DD      AA      TT     AVG*

Version - Most Relevant

II      1.000   0.378   0.469   0.551   0.466
DD      0.552   1.000   0.452   0.551   0.518
AA      0.616   0.407   1.000   0.536   0.520
TT      0.491   0.336   0.364   1.000   0.397
AVG*    0.553   0.374   0.428   0.546

Version - All Relevant

II      1.000   0.357   0.437   0.523   0.439
DD      0.524   1.000   0.413   0.500   0.479
AA      0.54?   0.348   1.000   0.485   0.458
TT      0.468   0.305   0.351   1.000   0.375
AVG*    0.511   0.337   0.401   0.503

Version - All Documents

II      1.000   0.289   0.264   0.394   0.316
DD      0.39?   1.000   0.256   0.364   0.337
AA      0.371   0.267   1.000   0.307   0.315
TT      0.321   0.220   0.178   1.000   0.240
AVG*    0.361   0.25?   0.233   0.355

* Averages were computed with the diagonal element omitted.

** The representations in the columns form the denominator of
the overlap measure.

Table 14

Union Pairwise Overlaps - Phase II

          II      DD      AA      TT     AVG*

Version - Most Relevant

II      0.377   0.719   0.640   0.528   0.629
DD      0.719   0.550   0.821   0.701   0.747
AA      0.640   0.821   0.495   0.651   0.704
TT      0.528   0.701   0.651   0.336   0.627

Version - All Relevant

II      0.368   0.715   0.624   0.525   0.621
DD      0.715   0.539   0.806   0.704   0.742
AA      0.624   0.806   0.454   0.624   0.685
TT      0.525   0.704   0.624   0.329   0.618

Version - All Documents

II      0.314   0.616   0.640   0.469   0.575
DD      0.616   0.424   0.753   0.587   0.652
AA      0.640   0.753   0.442   0.619   0.671
TT      0.469   0.587   0.619   0.256   0.558

* Averages were computed with the diagonal element omitted.

VII. DISCUSSION

What are the factors which explain these findings? Are the
results simply due to chance variations or are there some
systematic components that can be identified? This section of
the report responds to these questions. First, differences in
data bases and indexer instructions will be reviewed. Then
different overlap models of the data will be presented and
explored from several viewpoints.

A. Data Bases and Indexing



As noted earlier, there are two related factors that might
have contributed to the differences in performance of descriptors
(DD) and free-index phrases (II) in the two data bases. They are
the differences in the indexing procedures used and the avowed
purpose of the representations in the data bases. Indexing
procedures are not so much a function of the written indexing
rules (though such rules exist, for example INSPEC, 1970) but are
more a matter of what the indexers actually do.



At INSPEC, indexers read the title and abstract, while at
PsychAbs the indexers focus on the abstract only. Both groups
of indexers then identify the main concepts of the document. At
INSPEC the concepts are taken in the form of the actual phrases
used in the document. To this list of phrases the INSPEC
indexers add any concepts implicit in the document not already
represented by the selected phrases. The phrases plus the
implicit concepts form the II representation. The descriptor
terms (DD) at INSPEC are then generated from a thesaurus, the
goal being to select terms that represent the concepts noted in
the title and abstract.

At PsychInfo the indexers reverse this process. First they
use the thesaurus to select descriptor terms that best represent
the concepts found in the document abstract. The free-index
phrases are then generated from the abstract to provide
supplementary information. For documents reporting experimental
research, the supplementary information (in the form of II
phrases) further describes the details of the study --
information about the variables used and the subject population.
For nonexperimental or theoretical articles, the free-index
phrases are more general descriptions of the documents.



Thus, to some extent there is a relationship between the II
phrases used in INSPEC and the descriptors used in PsychAbs.
Both are generated from the document and, more importantly, both
attempt to capture the main concepts of the document. In
comparison, descriptors assigned by INSPEC indexers may not
exhaustively capture all of the concepts in the document because
the procedure used misses implicit concepts and also because the
descriptors used at INSPEC were developed for a manual system and
as a result are not as exhaustive as they could be. The
identifier phrases in PsychAbs are not meant to exhaustively
represent all of the concepts in the document. For these
reasons, we could expect the descriptors in PsychAbs and the II
representation in INSPEC to perform quite well in comparison with
the other representations used in these data bases in their
ability to retrieve relevant documents.



Precision is a function of specificity. The II phrases used
by INSPEC are for the most part composed of the author's own
words and are therefore as specific as free-index terms. And, as
noted earlier, the II phrases in PsychAbs may be much more
general. In PsychAbs, however, it is the descriptor field that
is designed to be specific as well as exhaustive (APA, 1976).



From this analysis it seems possible that the (relative)
superior performance of II in INSPEC and DD in PsychAbs in terms
of both recall and precision may be a function of their
similarity of purpose and the method by which they are produced:
both are generated from the concepts found in the document and
both aim at exhaustivity while maximizing the specificity of the
terms selected.

B. Descriptive Models of Overlap



Overlaps between pairs of representations were discussed
earlier. The question of concern here focuses on the
relationship among all of the representations: what is the
optimum combination of representations or, more precisely, the
optimum ordering of representations? That is, if a retrieval
environment were limited to a single representation, which one
would it be? If a second could be added, which of the remaining
representations contributes the most over and above the effect of
the first representation? A third representation could be added
over and above the first two, and so on.
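
The ordering question posed above can be sketched as a greedy
procedure: at each step, pick the representation that adds the
most documents not yet retrieved. This is an assumption about how
such an ordering might be computed, not a statement of the method
used in this study; the function name and the document sets below
are hypothetical.

```python
def order_by_incremental_gain(retrieved):
    """Greedy ordering of representations by newly added documents.

    `retrieved` maps a representation label to the set of documents it
    retrieved. Returns (label, cumulative count, cumulative proportion)
    tuples. Ties go to the label listed first in `retrieved`.
    """
    remaining = dict(retrieved)
    total = len(set().union(*retrieved.values()))
    covered = set()
    ordering = []
    while remaining:
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        covered |= remaining.pop(best)
        ordering.append((best, len(covered), len(covered) / total))
    return ordering


# Hypothetical retrieved sets for four representations.
example = {
    "TT": {1, 2},
    "AA": {1, 2, 3, 4},
    "DD": {5, 6, 7},
    "II": {4, 5},
}

for label, cum_docs, cum_share in order_by_incremental_gain(example):
    print(label, cum_docs, round(cum_share, 3))
```

The cumulative count and proportion returned at each step
correspond to the "Cum. No. Docs." and "Cum. Percentage" rows of
an incremental-improvement table, with the last step reaching
1.000 by construction.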

The most sensible measure to use in answering this question
is based on the union overlap.* Tables 15 and 16 present the
results of this analysis. Table 15 uses all seven
representations for the Phase I data and analyzes both the highly
relevant as well as the total relevant measures across queries.

*Union overlaps are recall estimates and the discussion in this
section is based on these recalls only; precision is not
considered.

Table 15

Representations Ordered by Incremental Improvement

Phase I

Order                   1st     2nd     3rd     4th     5th     6th     7th

Most Relevant
  Representation        TA      II      AA      DD      TT      ST      DI
  Cum. No. Docs.        299     444     574     656     722     768     810
  Cum. Percentage      .369    .548    .709    .810    .891    .948   1.000

All Relevant
  Representation        II      ST      DI      TA      TT      AA      DD
  Cum. No. Docs.        527     889    1118    1318    1466    1602    1723
  Cum. Percentage      .306    .516    .649    .765    .850    .930    1.00
Table 16

Representations Ordered by Incremental Improvement

Phases I* and II

                                            Order
                                  1st     2nd     3rd     4th

Most Relevant
  Phase I    Representation       II      AA      TT      DD
             Cum. No. Docs.       282     452     554     634
             Cum. Percentage      .445    .713    .874    1.000

  Phase II   Representation       DD      AA      TT      II
             Cum. No. Docs.       339     506     573     616
             Cum. Percentage      .550    .821    .930    1.000

All Relevant
  Phase I    Representation       II      AA      DD      TT
             Cum. No. Docs.       527     870     1093    1275
             Cum. Percentage      .413    .682    .857    1.000

  Phase II   Representation       DD      AA      TT      II
             Cum. No. Docs.       871     1302    1489    1615
             Cum. Percentage      .539    .806    .922    1.000

*Compound Representations Omitted
Since three representations (TA, DI, ST) are composed of other
representations, the analysis was repeated in Table 16 omitting
these "compound" representations. Table 16 also includes the
comparable results from Phase II.



Tables 15 and 16 present different models, that is, different
orderings of representations. Such models, if consistent, would
allow a searcher to know which combinations of fields would be
most likely to retrieve relevant documents. Such models would
also point to obvious economies in the design and operation of
retrieval systems. Unfortunately, these data suggest that the
models are not totally consistent. There are differences within
data bases which depend upon the definition of relevance used
(most relevant versus all relevant); there is also the presence
of the compound representations in the Phase I study, which
hampers our ability to see a pattern in the other fields; and,
most dramatically, there are differences in the orderings between
Phase I and Phase II. These differences could be a function of
the data bases themselves (e.g., specificity of terms), a
function of how they were constructed (e.g., instructions given
to indexers), or an interaction between the two.



There are also some interesting similarities evident in
Table 16. Though the models (orderings) differ between Phases,
they are very similar within Phases. For Phase II the order
doesn't change as a function of relevance stringency, and the
change for Phase I is both small and less important (involving
the third and fourth representations). There are also similarities
in the growth rates within each Phase, as evident in the
cumulative percentages.



What appears to be highly consistent is the cumulative
increase in the percentage of relevant documents accounted for as
each additional representation is included. This similarity may
simply be due to the fact that the models are based on highly
interrelated data: within each phase, the data are subsets of one
another. When the cumulative percentages are plotted against the
order, the resulting curves appear to be hyperbolic in form. The
next section of this report presents one theoretical
interpretation for this finding.



The overlap among document representations can also be
viewed from the perspective of a representation's "unique"
contribution. For a given representation, what documents does it
contribute to the relevant retrieved that were not retrieved
under any other representation? The question is equivalent to
asking for the observed improvement in the models when the
representation is the last entered into the model. Tables 17 and
18 report the effect of each representation, assuming the
representation entered the model first or last. These are the
maximum and minimum incremental improvements for each
representation.
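As a minimal sketch of the two quantities just defined, the maximum contribution of a representation is the size of its own retrieved set, and the minimum ("unique") contribution is what remains after removing everything retrieved by the other representations. The function name `contributions` and the toy document ids are hypothetical.

```python
from typing import Dict, Set, Tuple


def contributions(retrieved: Dict[str, Set[int]]) -> Dict[str, Tuple[int, int]]:
    """Return (maximum, minimum) contribution for each representation.

    Maximum: documents retrieved when the representation is entered first.
    Minimum: documents retrieved by no other representation (entered last).
    """
    result = {}
    for name, docs in retrieved.items():
        others = set().union(*(d for n, d in retrieved.items() if n != name))
        result[name] = (len(docs), len(docs - others))
    return result


# Hypothetical toy data: relevant document ids retrieved under each representation.
toy = {
    "TT": {1, 2, 3},
    "AA": {3, 4, 5, 6},
    "DD": {6, 7},
    "II": {1, 2, 3, 4, 8},
}

for name, (maximum, minimum) in contributions(toy).items():
    print(name, maximum, minimum)
```

Dividing either count by the size of the union of all retrieved sets gives the corresponding percentage columns.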
Table 17

Maximum and Minimum Contribution of Seven Representations

Phase I

                      Maximum Contribution*     Minimum Contribution*
          Repr.       No. Docs.   Percent**     No. Docs.   Percent**

Most Relevant
          AA          266         .328          49          .060
          DD          192         .237          44          .054
          DI          250         .309          42          .052
          II          282         .348          74          .091
          ST          246         .304          44          .054
          TA          299         .369          53          .065
          TT          231         .285                      .064
          Sum                                               .440

All Relevant
          AA          488         .283          137         .080
          DD          373         .216          127         .074
          DI          462         .268          120         .070
          II          527         .306          196         .114
          ST          485         .281          149         .086
          TA          506         .294          134         .078
          TT          395         .229          133         .077
          Sum                                               .579

*Maximum contribution is the effect of that representation
alone: either it is the sole representation in the data
base or it is used (entered) first, before the others are
used. Maximum contribution is therefore equivalent to
micro-recall (see Table 8). Minimum contribution is the
"unique" effect of that representation after all documents
retrieved by the other six representations have been
removed; thus it can be considered to have entered the
search process last.

**Percentages are based on all documents retrieved in each
category: 810 for the most relevant and 1723 for all relevant.
Table 18

Maximum and Minimum Contributions
of Four Representations

Phase I and Phase II

                        Maximum Contribution*     Minimum Contribution*
              Repr.     No. Docs.   Percent**     No. Docs.   Percent**

Most Relevant
  Phase I     AA        266         .328          125         .154
              DD        192         .237          85          .105
              II        282         .348          114         .141
              TT        231         .285          88          .109
              Sum                                             .509

  Phase II    AA        310         .475          112         .172
              DD        339         .520          158         .242
              II        229         .351          42          .064
              TT        210         .322          50          .077
              Sum                                             .555

All Relevant
  Phase I     AA        488         .283          269         .156
              DD        373         .216          197         .114
              II        527         .306          271         .157
              TT        395         .229          182         .106
              Sum                                             .533

  Phase II    AA        728         .440          286         .173
              DD        870         .526          429         .259
              II        579         .350          120         .072
              TT        518         .313          131         .079
              Sum                                             .583

*Maximum contribution is the effect of that representation alone:
either it is the sole representation in the data base or it was
used (entered) first, before the others are used. Maximum
contribution is therefore equivalent to micro-recall (see Table 8).
Minimum contribution is the "unique" effect of that representation
after all documents retrieved by the other three representations
have been removed; thus, it can be considered to have entered the
search process last.

**Percentages are based on all documents retrieved by all
representations in each category. For Phase I that number is 810 for
most relevant and 1723 for all relevant. For Phase II the numbers are
652 for most relevant and 1653 for all relevant.
The "unique" effect of each representation is reported as the
minimum contribution.

The lack of overlap among representations is again evident
in the unique percentages. Given a data base with four
representations, the fourth representation can contribute a
sizeable number of additional relevant documents: approximately
25 percent for the DD representation in Phase II, and
approximately 15 percent for the II representation in Phase I.
Even when the number of document representations is increased to
seven (see Table 17), there is an approximate 10 percent
contribution of relevant documents by the seventh representation
(II in the INSPEC data base).



One final indicator of the lack of overlap among document
representations is the sum of the unique contributions (Tables 17
and 18). Considering Phase I and Phase II, these totals range
from 44 percent to about 58 percent. Thus, the amount of
overlapping documents ranges from 42 percent to a high of 56
percent.
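The arithmetic behind these ranges can be checked in a few lines. The sums below are the "sum of unique contributions" figures reported in Tables 17 and 18; the dictionary labels are only descriptive, and treating the overlapping share as one minus the unique share is the reading assumed here.

```python
# Sum of unique (minimum) contributions, as reported in Tables 17 and 18.
unique_sums = {
    "Table 17, Phase I, most relevant": 0.440,
    "Table 17, Phase I, all relevant": 0.579,
    "Table 18, Phase I, most relevant": 0.509,
    "Table 18, Phase II, most relevant": 0.555,
    "Table 18, Phase I, all relevant": 0.533,
    "Table 18, Phase II, all relevant": 0.583,
}

# Whatever is not unique to a single representation was retrieved by two or more.
overlapping = {label: 1.0 - s for label, s in unique_sums.items()}

for label, share in overlapping.items():
    print(f"{label}: unique {unique_sums[label]:.1%}, overlapping {share:.1%}")

lowest, highest = min(overlapping.values()), max(overlapping.values())
print(f"overlapping share ranges from {lowest:.0%} to {highest:.0%}")
```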



The incremental contributions reported in these Tables can
also be used to provide some measure of the effect of human
intervention in preparing documents for inclusion in a retrieval
system. Taylor (in press) writes of the "value-added" process in
document preparation. Document indexing is believed to add value
to the document because it makes the document more readily
accessible. Among the four basic representations used in the two
studies reported here, II and DD require intellectual
intervention. Between these two representations, DD can be
thought of as making more use of intellectual contribution
because it is based on the human-produced thesaurus. As viewed
from this perspective, the strong showing of both DD and II in
terms of maximum and minimum contributions provides support for
intellectual-based representations. Though the actual figures
given in Tables 17 and 18 are useful in this regard, they are
essentially recalls, and a better quantification of value-added
would combine these with measures of precision (e.g., van
Rijsbergen, 1979, p. 167).



C. Theoretical Model of Overlaps 


Can the obtained overlap results presented earlier in this
report be understood or interpreted in terms of some theoretical
model? Of the several possible approaches which could be
developed, one of the most basic is a probabilistic model based on
the assumption that relevant retrievals are independent in the
different representations, a plausible assumption given the low
levels of recall obtained. It is assumed that each
representation retrieves an independent random sample of the
relevant documents. Given this conservative assumption, what
overlaps would be predicted for the different observations and
how well do these predictions agree with the obtained results?

The derivation of such a model is presented in the first part of
Appendix H. That model is then used to predict asymmetrical
overlaps. Given the independence assumption, asymmetrical
overlaps, being conditional probabilities, simplify to the
micro-recall value of the second representation (see Appendix H,
part 2 for a more formal proof).
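The simplification stated above can be written out in one line. The notation is an assumption made for illustration and is not taken from Appendix H: $R_A$ and $R_B$ denote the sets of relevant documents retrieved under representations A and B, and $r_A$, $r_B$ their micro-recalls.

```latex
% Assumed notation: r_A = P(R_A), r_B = P(R_B) are micro-recalls.
% Independence means P(R_A \cap R_B) = r_A r_B.
\begin{align*}
P(R_B \mid R_A) &= \frac{P(R_A \cap R_B)}{P(R_A)}
                 = \frac{r_A \, r_B}{r_A}
                 = r_B .
\end{align*}
```

Under this reading, the predicted asymmetrical overlap does not depend on the conditioning representation at all, so an observed/predicted ratio far from 1 signals a departure from independence.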



The predicted values are presented in Table 19. The
patterns in the two Phases are similar. The model fits the data
remarkably well, given the single, simple assumption on which it
was based. The greatest deviations from the model are identified
by very large or very small values in the (obser/pre) data: (1)
there are substantially lower than expected overlaps between AA
and DD, and (2) substantially higher than expected overlaps
between TT and II. In Phase II there is also a higher than
predicted overlap between free-text abstract terms and identifier
terms; this finding did not occur in Phase I.
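The deviations named above can be located mechanically from the observed/predicted ratios. The sketch below uses the Phase I predicted and observed values from Table 19; the function name `flag_deviations` and the 0.15 tolerance are hypothetical choices made only for illustration.

```python
# (row, column): (predicted, observed) overlaps for Phase I, from Table 19.
phase1 = {
    ("II", "DD"): (0.348, 0.365),
    ("II", "AA"): (0.348, 0.361),
    ("II", "TT"): (0.348, 0.424),
    ("DD", "II"): (0.237, 0.248),
    ("DD", "AA"): (0.237, 0.192),
    ("DD", "TT"): (0.237, 0.268),
    ("AA", "II"): (0.328, 0.340),
    ("AA", "DD"): (0.328, 0.266),
    ("AA", "TT"): (0.328, 0.329),
    ("TT", "II"): (0.285, 0.348),
    ("TT", "DD"): (0.285, 0.323),
    ("TT", "AA"): (0.285, 0.286),
}


def flag_deviations(cells, tolerance=0.15):
    """Return the cells whose observed/predicted ratio differs from 1 by more
    than `tolerance` (an arbitrary threshold assumed for this sketch)."""
    flagged = {}
    for pair, (predicted, observed) in cells.items():
        ratio = observed / predicted
        if abs(ratio - 1.0) > tolerance:
            flagged[pair] = round(ratio, 2)
    return flagged


for pair, ratio in flag_deviations(phase1).items():
    print(pair, ratio)
```

With this threshold the flagged pairs are the AA/DD and TT/II cells, the same two deviations singled out in the text.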



The obtained low overlap between AA and DD is not
surprising, reflecting the contrast between controlled and "free"
vocabulary. In fact, these two representations are at opposite
ends of the continuum from least to most controlled: AA, TT, II,
DD. The high overlaps between titles and index phrases may
indicate that titles are well chosen by authors. That is, they
contain many of the same key words as an indexer would select.
The high overlaps between AA and II in Phase II could be a
function of indexer practice at PsychAbs: indexers may not go
beyond the abstract to find identifier phrases. Or in the INSPEC
data base (where the overlap is lower), perhaps the indexers find
that they frequently need to go beyond the abstract to choose the
key II phrases.



This same model can also be used to predict the incremental
effects on recall through use of additional representations (as
in Tables 15 and 16). Given four representations, the predicted
recall using the model can be determined for a single
representation, for two representations, etc., as shown below.
Table 19

Predicted* and Obtained Asymmetrical Overlaps

Phase I
                        II        DD        AA        TT        AVG

  II    Predicted                 .348      .348      .348      .348
        Observed                  .365      .361      .424      .383
        Obser/pre                (1.05)    (1.04)    (1.22)    (1.10)

  DD    Predicted       .237                .237      .237      .237
        Observed        .248                .192      .268      .236
        Obser/pre      (1.05)              (0.81)    (1.13)    (1.00)

  AA    Predicted       .328      .328                .328      .328
        Observed        .340      .266                .329      .312
        Obser/pre      (1.04)    (0.81)              (1.00)    (0.95)

  TT    Predicted       .285      .285      .285                .285
        Observed        .348      .323      .286                .319
        Obser/pre      (1.22)    (1.13)    (1.00)              (1.12)

  AVG   Predicted       .283      .320      .290      .304      .300
        Observed        .312      .318      .280      .340      .312
        Obser/pre      (1.10)    (0.99)    (0.97)    (1.12)    (1.04)

Phase II
                        II        DD        AA        TT        AVG

  II    Predicted                 .351      .351      .351      .351
        Observed                  .378      .469      .551      .466
        Obser/pre                (1.08)    (1.34)    (1.57)    (1.33)

  DD    Predicted       .520                .520      .520      .520
        Observed        .552                .452      .551      .518
        Obser/pre      (1.06)              (0.87)    (1.06)    (1.00)

  AA    Predicted       .475      .475                .475      .475
        Observed        .616      .407                .536      .520
        Obser/pre      (1.30)    (0.86)              (1.13)    (1.09)

  TT    Predicted       .322      .322      .322                .322
        Observed        .491      .336      .364                .397
        Obser/pre      (1.52)    (1.04)    (1.13)              (1.23)

  AVG   Predicted       .439      .383      .398      .449      .417
        Observed        .553      .374      .428      .546      .475
        Obser/pre      (1.26)    (0.98)    (1.08)    (1.22)    (1.14)

*Based on the model, predicted values are micro-recalls.

Representation(s)                 Predicted Micro-Recall*

Any single representation         1 - (1 - r_1)
Any two representations           1 - (1 - r_1)(1 - r_2)
Any three representations         1 - (1 - r_1)(1 - r_2)(1 - r_3)
All four representations          1 - (1 - r_1)(1 - r_2)(1 - r_3)(1 - r_4)

*See Appendix H, part 1.



To get the maximal increments as each representation is added, we
simply need to order the four representations by their
micro-recall values from Table 8. The results of applying the
model to the Phase I data are presented in Table 20.
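The predicted recalls can be recomputed directly from the formula 1 - (1 - r_1)(1 - r_2)... using the Phase I micro-recall values listed in Table 20. The function name `predicted_recall` is a hypothetical label for this sketch; `math.prod` requires Python 3.8 or later.

```python
from math import prod


def predicted_recall(micro_recalls):
    """Recall predicted for a combination of representations, assuming each
    retrieves an independent random sample of the relevant documents."""
    return 1 - prod(1 - r for r in micro_recalls)


# Phase I micro-recalls in decreasing order (II, AA, TT, DD), from Table 20.
most_relevant = [0.348, 0.328, 0.285, 0.237]
all_relevant = [0.306, 0.283, 0.229, 0.216]

for label, recalls in (("most relevant", most_relevant), ("all relevant", all_relevant)):
    cumulative = [round(predicted_recall(recalls[:k]), 3) for k in range(1, 5)]
    print(label, cumulative)
```

Entering the representations in decreasing order of micro-recall gives the largest possible predicted increment at each step, which is why the ordering in the text is by micro-recall.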



So, at least for the data in Phase I, the model predicts
quite well. Predictions are not made for the Phase II data
because the obtained relative recall is not an accurate enough
estimate of actual recall: there are not a sufficient number of
relevant documents known to be in the data base beyond those
retrieved by the four representations.



The overall conclusion is that overlaps are much as might be
expected if the representations were selecting relevant documents
from the data base at random. The problem of finding truly
complementary representations is largely unsolved, but the
contrast between abstract words (AA) and descriptors (DD) is a
small step in the right direction. If these results generalize
to other data bases, then one interpretation is that systems
should have both controlled and "free" document representation
vocabularies.
Table 20

Predicted and Obtained Incremental Improvements
in Recall - Phase I

                            Micro-    Combined            Predicted   Observed
          Order    Repr.    recall    Representations     Recall      Recall

Most Relevant
          1st      II       .348      I                   .348        .349
          2nd      AA       .328      I, A                .562        .558
          3rd      TT       .285      I, A, T             .687        .684
          4th      DD       .237      I, A, T, D          .761        .783

All Relevant
          1st      II       .306      I                   .306        .306
          2nd      AA       .283      I, A                .502        .505
          3rd      TT       .229      I, A, T             .616        .634
          4th      DD       .216      I, A, T, D          .699        .740

NOTES: (1) Micro-recall values are taken from Table 8.

       (2) Predicted recall is computed from the formulas in the
           text of the report.

       (3) Observed recalls are computed from the number of
           relevant documents retrieved (Table 16) divided
           by either 810 or 1723 (Table 15). Observed
           recalls are relative recalls based on seven
           representations. These figures will, therefore,
           overestimate actual recall.
Appendix A-1

PROJECT DESCRIPTION

This project will examine the relation between the relevance
of retrieved citations and the fields that were searched to
obtain them. Retrieval from seven different document
representations will be studied. A representation consists of one
or two designated search fields.

The data base for the study is Computer and Control Abstracts
(a subfile of INSPEC). The system you will use is a local
simulator of DIALOG, mounted on the S.U. computer. Almost all
DIALOG features are available for you to use, but some restrictions
will be made to achieve the study objectives.

The objectives of the study require you to conduct high
recall searches, but with a limit of no more than 50 citations
per query.

In all, you will be asked to search 98 queries. Over the
course of the study, you will use all seven representations, but
for each query only one representation will be assigned.

For each query, you will be asked to search from a request
form; the statement of the query was prepared by a real user who
will receive the output. The request form will also prescribe
the representation you are to use. The unique password assigned
to the request will automatically "lock" the search so that you
can only search on the designated parts of the citations.

After you have completed each search (including the
essential print command), return the search request form and
a copy of your interaction with the system to Brian McLaughlin.

(5/2/80)
Phase I

Appendix A-2

DATA BASE

Computer and Control Abstracts is that portion of the INSPEC Data
Base dealing with all areas of computing and information science.
The specific data base that will be searched in this study consists
of four months (Sept. - Dec. 1979) of Computer and Control Abstracts.

The citations you will retrieve will be organized as follows:

DN number (abstract numbers from INSPEC journals)
Title
Authors (separated by commas)
Source fields as follows:
   Publication; (volume and issue number) (part number)
   pagination data
   Following this may be information in [ ]. This is
   information on the cover-to-cover translation as
   follows: [publication; (volume and issue) pages
   date] (type of unconventional media) (availability)
   (Title of conference), (location of conference),
   (sponsoring organization) (date) language
Abstract
Indexing information

NOT all the citations will contain each of these items of information.

Phase I

DIALOG - SIMULATOR DIFFERENCES

The DIALOG simulator you will be using to conduct the searches is
almost identical to "regular" DIALOG. In general, searching should
be performed in the same way as any DIALOG search.

The restrictions, cautions and limitations are noted below.

1. Each new query you search must be started with the full
   BEGIN.

2. To restrict a search to a particular language, use a
   Limit /ENG (for English), or whatever language you wish.

3. Adjacency (nW) cannot be used with either truncation or
   stemming.

4. Adjacency may run very slow; the field operator (F) can
   be used instead.

(5/2/80)
Phase I

Appendix A-3

THE REPRESENTATIONS

You will be using seven different representations during the
study. A representation names the one or two fields of the citation
to which your search must be restricted. You will search on only
one representation for any given query. The representation you
are supposed to search on will be designated on the request form
we give to you. A unique password will be given with each request
and this password will automatically lock the search onto the
assigned representation.

The seven representations and the fields they will search
are as follows:

TT - will search terms in title only.

AA - will search terms in abstract only.

DD - will search descriptor terms only. A thesaurus will
     be provided to you for use with this controlled
     vocabulary representation. (The thesaurus may only
     be used on this project.)

II - will search identifier terms only.

TA - will search terms in title and abstract only.

ST - will search stemmed terms in title and abstract only.
     The computer will automatically take the logical root
     of any entered term. Truncation cannot be used with
     this representation.

DI - will search terms in descriptor and identifier fields.
     The thesaurus will be provided for use with this
     controlled vocabulary representation.

One representation with which you may be unfamiliar is
stemming (ST), which will be used with title and abstract words
only. A stemmed term is a word that has been shortened by the
computer to its logical root. This is similar to truncation in
that the stem LIBRAR would retrieve LIBRARY, LIBRARIES,
LIBRARIAN, etc. For truncation, however, the root is determined
by the searcher. For example, if you entered LIBRARY under the
ST representation, the term would automatically be reduced by the
computer to its logical root and LIBRARY, LIBRARIES, LIBRARIAN,
LIBRARIANS, etc. would all be retrieved.

Truncation is not to be used with the stemming representation. 
In fact, the simulator will reject any attempts to use truncation 
in this representation. 
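The difference between stemming and truncation described above can be illustrated with a toy sketch. The suffix list and both function names are hypothetical; the handout does not say which stemming algorithm the simulator uses, so this is not a description of it.

```python
# Hypothetical suffix list, longest first; NOT the simulator's actual stemmer.
SUFFIXES = ("IANS", "IAN", "IES", "Y", "S")


def toy_stem(word: str) -> str:
    """Computer-chosen root: strip the first matching suffix from the word."""
    word = word.upper()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def truncation_match(pattern: str, word: str) -> bool:
    """Searcher-chosen root: 'LIBRAR?' matches any word beginning with LIBRAR."""
    return word.upper().startswith(pattern.rstrip("?").upper())


words = ["LIBRARY", "LIBRARIES", "LIBRARIAN", "LIBRARIANS"]

# Stemming: the searcher enters a whole word and every variant shares its stem.
print({w: toy_stem(w) for w in words})

# Truncation: the searcher must decide where to cut the word.
print([w for w in words if truncation_match("LIBRAR?", w)])
```

In both cases the same four words are retrieved; what differs is who decides on the root, the computer (stemming) or the searcher (truncation).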
Phase I

NAME:                                   DATE:

SCHOOL ADDRESS:                         PHONE:

HOME ADDRESS:                           PHONE:

We would like a description of your topic of interest. This
statement should be clear enough so that any person who also knows
about this topic would, on the basis of this statement alone, be
able to pick out citations of interest for you.

Please write your description here:

(handwritten response, not legible in this copy)

Given your purposes in requesting this search, how many citations
do you want?

About how many citations on your topic do you expect to receive
from this computer search?

YOU MAY FOLD THIS REQUEST FORM IN THIRDS, STAPLE SECURELY, AND
DROP IN CAMPUS MAIL.                                     4/4/80
Phase II

NAME:                                   DATE:

SCHOOL ADDRESS:                         PHONE:

HOME ADDRESS:                           PHONE:

We would like a description of your topic of interest. This
statement should be clear enough so that any person who also knows
about this topic would, on the basis of this statement alone, be
able to pick out citations of interest for you.

Please write your description here:

(handwritten response, not legible in this copy)

Given your purposes in requesting this search, how many citations
do you want?

About how many citations on your topic do you expect to receive
from this computer search?

YOU MAY FOLD THIS REQUEST FORM IN THIRDS, STAPLE SECURELY, AND
DROP IN CAMPUS MAIL.                                     4/4/80
Phase II
SEARCHER INFORMATION

PROJECT DESCRIPTION:

This project will examine the relation between the relevance of
retrieved citations and the fields or representations that were
searched to obtain them. The database for the study is a portion of
Psychological Abstracts. Searchers will be asked to search each query
four times, once under each of the four representations.

REPRESENTATIONS:

A representation names the field of the citation to which a
search must be restricted. The four representations to be used for
each query by each searcher are:

1) TT - search terms in TITLE only.

2) AA - search terms in ABSTRACT only.

3) DD - search terms in DESCRIPTORS only.

4) II - search terms in IDENTIFIERS only.

DATA BASE DESCRIPTION:

The database consists of journal articles written in English
from Psychological Abstracts (PA) published during six months (July-
December 1980). This file contains both clinical and research aspects
of psychology and includes subjects such as cognitive processes,
educational psychology, psychometrics and statistics, and guidance
and counseling.

PA citations printed on-line exhibit the following categories of
information, when available:

Document Number
Title
Author
Source
Section Code
Abstract
Descriptors
Identifiers

SYSTEM FEATURES:

You will be using DIATOM, a system mounted on the S.U. computer
which is a local simulator of DIALOG, and almost identical to it.
Some of the major features you will probably make use of are:

- Select or Select Steps.

- Boolean operators with a Select or Combine statement.

- Full text operators: (W), (nW), (F), (C).

- Truncation with any operator (boolean or full text).
SEARCHER INFORMATION, Page 2, Phase II

Refer to the DIATOM-DIALOG simulator handout which lists all the
possible commands. Use only those which are in both systems.

Although a stemmer and some other "automatic" features are
available on DIATOM, do not use them as DIALOG does not have them.

SEARCH PROCEDURE:

Each searcher will search on 40 to 60 queries. Four searches
will have to be conducted by each searcher for each query, one for
each of the four representations. The four representations must be
searched in a pre-specified order.

Your job as searcher is to prepare and conduct high recall
searches.

For each search you will be given a request form. The query will
be prepared by a real user who will receive the output. You will
receive no information regarding the user's request other than what
is designated on the request form. This form will also have the
order of the representations to be used designated on it.

You are to pick up the search requests on Mondays and Thursdays,
and return the completed searches by the Monday or Thursday that
follows. You will have 2-3 days to complete each search.

You may perform the search on any terminal that is or can be
connected to S.U. and that is convenient to you, as long as a hard
copy can be printed.

Here it is important to note that each search on a query should
be started with a BEGIN command, which (together with the query
number and searcher password) locks the search to a particular
representation. The next BEGIN command for the same query locks it
to a different representation according to a pre-assigned order of
representations. This way the order of representations to be used
cannot be changed. You will be given a thesaurus for controlled
vocabulary searching.

When you have completed a search, use a PRINT command with
Format 1 to get the document numbers of the retrieved set. If no
documents have been retrieved, type in NOTHING FOUND and print out
any one document with FORMAT 1.

Return 1) the search request packet, filling in the
          needed information, and

       2) a copy of your interaction with the system

to Brian McLaughlin.
Phase II

DIATOM - a DIALOG Simulator

DIATOM (Dialog Implementation - Augmented To Overcome Magic) was
implemented at Syracuse University by Robert Waldstein as both a
teaching device and a research tool. It incorporates most of the
features of DIALOG and has a few additional features. The comparison
in the following description is accurate as of May 1980.

Command Summary

BEGINn, Bn, !n

To start a search in file n. Erases work done to that point;
restarts set numbers at 1.
Examples: BEGINn; B1; !1

BEGIN

Equivalent to BEGINn but includes a routine for labeling the
search.
Examples: BEGIN

BEGIN BYPASS, BB, !B

This command is the equivalent of BEGIN1.
Examples: BEGIN BYPASS; !B; BB

EXPAND, E

To display a part of an index. May be used with words, prefix
codes, or online thesaurus.
Examples: EXPAND AST; ELIBRARY; EAU=Waldstein, R?; E R1
Simulator difference: Only one expand list exists at a time,
i.e. you can't have both an E and R list at the same time.

EXPAND (word)

To display subject related terms from a thesaurus.
Examples: EXPAND (ENERGY); E (READING)

SELECT, S

To request postings to be retrieved from the index. May be used
with words, prefix codes, or EXPAND numbers.
Examples: S MIRAGE; SAU=BOB; SE1,E4-E7; SR2,R4-R6,R9

SELECT can also be used with boolean operators. In that case it
selects a full boolean set description, with AND, OR, NOT, and
parentheses operators. Note that boolean hierarchy is used in
the following order: (), NOT, AND, OR. Set numbers may be used
as an item, e.g. S DOG AND S1; S DOG AND #1.
Examples: SELECT DOGS AND CATS; S DOG/DE,AB OR E3
          S (AU=BOB OR JO=JASIS) NOT E1-E5
Simulator difference: DIALOG always creates the sets in the
order given. E.g.

     S DOG AND LIBRAR? NOT R2,R5
           150  DOG
          2053  LIBRAR?
            12  R2,R5
            35  DOG AND LIBRAR? NOT R2,R5

The simulator may create the sets in a different order for
internal optimization reasons.
SELECT STEPS, SS, S STEPS

Equivalent of SELECT with boolean operators except that each term
results in a numbered set. For example:

     SS DOG AND LIBRAR? NOT R2,R5
       1     150  DOG
       2    2053  LIBRAR?
       3      12  R2,R5
       4      35  DOG AND LIBRAR? NOT R2,R5

SELECT [word]

Selects the thesaural entries for this word. It selects all
entries except RTs (related terms) and BTs (broader terms).
Examples: SELECT [ENERGY]; S [READING]
Simulator difference: DIALOG has no comparable capability.

COMBINE, C

Used with boolean operators AND, OR, NOT to relate sets. May
only be used with set numbers.
Examples: COMBINE 1 AND 2; C6-8/OR; C 4 AND (5 OR 6); C7-U

TYPE, T

To type record(s) online at a terminal. Used with either set
numbers or DIALOG accession numbers: set/format/range. Formats
1-8 are used.
Examples: TYPE 10; T12/2/1-6; TDN1023

DISPLAY, D

Displays a record online. Same as TYPE.
Examples: DISPLAY 10/3/2-4,7; D DN312

PRINT

To request offline prints. Used with either set numbers or
DIALOG accession numbers.
Examples: PRINT 7/5/1-49
Simulator difference: A print creates a set on disk named by the
password used at LOGON. It is of the form
     <1st 6 chars of password>.<last 2 chars>.
To get an offline print once the simulator is left, use the
monitor PRINT command.

PRINT -

To cancel the previous print request. Must be used before
LOGOFF, BEGIN, .FILE, or END commands.
Examples: PRINT -

END

Gives time elapsed and cost estimates since last BEGIN or END or
file change. Does not interfere with search strategy. Starts
new costing.
Examples: END

.COST

Gives the elapsed time and cost estimate since last BEGIN. Does
not interfere with search strategy.
Examples: .COST
DISPLAY SETS, DS

To display all sets made since previous BEGIN. Used for a recap
of search strategy used.

Examples: DISPLAY SETS; DS

DISPLAY SETS n-m,x,y-z

Used to display a certain subset of the created sets.
Examples: DISPLAY SETS 15-18,26; DS 3

Simulator difference: This capacity is a little broader than
DIALOG.

EXPLAIN, ?

To request online explanations of command and file features.
Examples: EXPLAIN FILE1; ?NEWS; ?NEGDIC

.FILEn

To change to another file. Use not recommended on either DIALOG
or the simulator.
Examples: .FILE 1

FEEDBACK, F

This enables the user to do feedback on a known relevant
document. Feedback can be done on four fields: title, abstract,
descriptors, and identifiers. For the title and abstract the
terms fed back on are those separated by spaces, while for the
descriptors and identifiers the terms separated by semicolons are
those fed back on. For this reason it will not work to combine
free and controlled representation feedback. Note feedback can
also be done on major fields (e.g. DE*). The default field is
the title. There are 3 different types of feedback available:

FEEDBACK 1 - This type of FEEDBACK ORs all the terms of the
desired field(s). Note that this is the default.

FEEDBACK 2 - This type of FEEDBACK ANDs all the terms of the
desired field(s). Note that usually this will give no documents.

FEEDBACK 3 - This FEEDBACK uses the ERIC thesaurus. Note
that it is therefore meaningful only on the descriptor field.
Examples: FEEDBACK2 DN123/TI; F DN5/ID*; F3 DN2543/DE; F
DN3456

Simulator difference: No equivalent feature in DIALOG.
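
The difference between FEEDBACK 1 and FEEDBACK 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the simulator's code: the function names `feedback_terms` and `feedback_query` are hypothetical, and the sketch only builds the Boolean expression (it does not run a search or consult the thesaurus used by FEEDBACK 3).

```python
def feedback_terms(field_text, controlled):
    """Split a field into feedback terms.

    Free-text fields (title, abstract) are split on spaces; controlled
    fields (descriptors, identifiers) are split on semicolons.
    """
    parts = field_text.split(";") if controlled else field_text.split()
    return [p.strip().upper() for p in parts if p.strip()]


def feedback_query(field_text, controlled=False, mode=1):
    """Build a Boolean expression: mode 1 ORs the terms, mode 2 ANDs them."""
    operator = " OR " if mode == 1 else " AND "
    return operator.join(feedback_terms(field_text, controlled))


print(feedback_query("Human social attitudes"))
print(feedback_query("EYE CONTACT; ALTRUISM", controlled=True, mode=2))
```

Because the two kinds of field are split on different separators, a single feedback request cannot sensibly mix a free-text field with a controlled one, which is the restriction noted above.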



NATURAL, N

Does a search on the words of a natural language request. Takes
the words of the command string and ORs their stems together.
Examples: NATURAL THE USE OF INFORMATION RETRIEVAL SYSTEMS
Simulator differences: No equivalent feature in DIALOG.

NATURAL RANK, NR

Does a search as in NATURAL but unstemmed and ranks the retrieved
documents by inverse document frequencies. Important note: the
sets created by this command can not be combined with other sets!
Note that format 12 gives the rank weights of the retrieved
documents.

Example: NR THE USE OF INFORMATION RETRIEVAL SYSTEMS
Simulator differences: No equivalent feature in DIALOG.
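
The manual does not give the exact weighting formula used by NATURAL RANK. The Python sketch below shows one common way to rank by inverse document frequency, assuming each retrieved document is scored by the sum of log(N/df) over the request words it contains; the function name `rank_by_idf` and the four sample documents are hypothetical.

```python
import math


def rank_by_idf(query, documents):
    """Return (doc_id, score) pairs sorted by descending summed IDF weight."""
    n_docs = len(documents)
    doc_words = {doc_id: set(text.upper().split()) for doc_id, text in documents.items()}
    scores = {}
    for word in set(query.upper().split()):
        holders = [doc_id for doc_id, words in doc_words.items() if word in words]
        if not holders:
            continue  # the word occurs in no document and contributes nothing
        idf = math.log(n_docs / len(holders))  # rarer words weigh more
        for doc_id in holders:
            scores[doc_id] = scores.get(doc_id, 0.0) + idf
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical four-document collection, for illustration only.
docs = {
    1: "information retrieval systems",
    2: "use of information",
    3: "library systems",
    4: "dog phobia",
}
ranking = rank_by_idf("THE USE OF INFORMATION RETRIEVAL SYSTEMS", docs)
print([doc_id for doc_id, _ in ranking])  # [2, 1, 3]
```

A ranked list of this kind carries a weight per document rather than simple membership, which is consistent with the note that sets created by NATURAL RANK cannot be combined with other sets.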







LIMIT, L

To restrict SELECTed set to specified requirements. Capability
varies by file.
Examples: LIMIT S/MAJ; L2/MIN; L 8/?liJ,HIN; L3/TI,AB 
Simulator difference: DIALOG does not permit LIMITing by field,
DIATOM does. In general, DIALOG has more LIMITs per file than
DIATOM. Check file documentation for details.

LIMIT ALL, LALL

Used before SELECTing sets to limit all subsequent SELECTing to
specified requirements. Capability varies by file.

Examples: LIMIT ALL/MAJ; LALL/STEM; LALL/DE,ID*,TI

Simulator difference: The simulator can't limit by accession
number. However, DIALOG can't limit by stem or by field.

LIMIT ALL/ALL 

To cancel a LIMIT ALL command 
Examples: LIMIT ALL/ALL; LALL/ALL 

PAGE, P

To request another screen (or page) of display after an EXPAND.
Examples: PAGE; P

LOGOFF

To sign off and disconnect from DIALOG or the simulator.
Automatically gives cost estimate of connect time.
Examples: LOGOFF

Simulator comments: The pause that sometimes occurs during
logoff is caused by two processes: all TMP files created by the
user are deleted and all PRINT commands are executed.

Search Save Commands: END/SAVE, .EXECUTE, .RELEASE, .RECALL

Simulator difference: None of these are implemented on the
simulator. Note however that they all give appropriate messages
when their use is attempted.
Search features

Truncation - ? (question mark)

There are four capabilities in truncation:

1) Unlimited number of characters after the stem.

SELECT EMPLOY?

2) Specified maximum number of characters after the stem.

SELECT HORSE? ?
SELECT THEAT?? ?

3) Embedded variable character

SELECT WOM?N
SELECT ADVERTI?E

4) Combination of the above.

SELECT WORKM?N?

Stemming - & (ampersand)

There are two capacities in stemming.

1) SELECT all words with same stem.

SELECT LIBRARIAN&

2) In combination with internal truncation.

SELECT WOM?N&

Simulator difference: No comparable feature in DIALOG.
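
The four truncation forms can be pictured as pattern matching. The Python sketch below translates a truncated term into a regular expression. It is an illustration only: the function name `truncation_to_regex` is hypothetical, stemming (&) is not covered, and it assumes that an embedded ? stands for exactly one letter and that n question marks followed by " ?" allow at most n further letters.

```python
import re


def truncation_to_regex(term):
    """Translate a truncated search term into a compiled regular expression."""
    term = term.strip().upper()
    # Form 2: stem, n question marks, a space, then "?" -> at most n trailing letters.
    limited = re.fullmatch(r"(.+?)(\?+) \?", term)
    if limited:
        body = limited.group(1)
        suffix = "[A-Z]{0,%d}" % len(limited.group(2))
    elif term.endswith("?"):
        # Form 1: a single trailing "?" -> any number of trailing letters.
        body, suffix = term[:-1], "[A-Z]*"
    else:
        body, suffix = term, ""
    # Form 3: any "?" left inside the body stands for one letter.
    core = "".join("[A-Z]" if ch == "?" else re.escape(ch) for ch in body)
    return re.compile(core + suffix)


print(bool(truncation_to_regex("EMPLOY?").fullmatch("EMPLOYMENT")))   # True
print(bool(truncation_to_regex("HORSE? ?").fullmatch("HORSESHOE")))   # False
print(bool(truncation_to_regex("WOM?N").fullmatch("WOMEN")))          # True
```

Form 4 needs no special case: a term with both an embedded and a trailing question mark passes through the same two steps.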

Basic index field indicators

Suffix symbols; used to specify searching on field(s) which make
up the basic index. Fields vary per database.

.../AB            Abstract
.../DE, .../DE*   Descriptors
.../DF, .../DF*   Full descriptors (single word)
.../ID, .../ID*   Identifiers
.../IF, .../IF*   Full identifiers (single word)
.../TI            Title
* indicates MAJOR

1) SELECTing single terms:
SELECT BUDGETS/TI

2) Specifying more than one field:
SELECT TENSION/TI,DE,ID

3) With full text operators:
SELECT POP (W) TOP (F) CANS/TI,AB
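
A suffixed term can be read as a search term plus a list of field codes, each optionally marked with * for MAJOR. The Python sketch below shows that reading; the function name `parse_select_term` is hypothetical, and the sketch assumes the term itself contains no slash.

```python
def parse_select_term(expression):
    """Split 'TERM/F1,F2*' into the term and a list of (field, is_major) pairs."""
    term, _, suffix = expression.partition("/")
    fields = []
    for code in suffix.split(","):
        code = code.strip().upper()
        if code:
            # A trailing asterisk restricts the field to MAJOR entries.
            fields.append((code.rstrip("*"), code.endswith("*")))
    return term.strip(), fields


print(parse_select_term("TENSION/TI,DE,ID"))
print(parse_select_term("BUDGETS/DE*"))
```

An expression with no suffix yields an empty field list, which a caller could treat as "search the whole basic index".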

Additional indexes

Always used with two-letter prefix code. Prefixes vary per
database.

AU= Author
JO= Journal

Full text operators

Used only with SELECT command.

(W) To request a word immediately adjacent to another in the
given sequence.
Example: S SOLAR (W) ENERGY

(nW) To request a word within n words of another in the given
order.
Example: S SOLAR (3W) ENERGY

(F) To request a word in the same field as another; in any
order in any field.
Example: S SOLAR (F) ENERGY

(C) To request a word in the same citation as another; in any
order. Note that this is the same as AND.
Example: S SOLAR (C) ENERGY

Simulator difference: The simulator does not recognize (L) or

Simulator comment: Adjacency searching (W) is very slow. E.g.
S INFORMATION (W) RETRIEVAL may take around 3 minutes.
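
The ordered proximity operators can be sketched as a comparison of word positions. The Python below is an illustration, not the simulator's algorithm: the function name `within` is hypothetical, and it assumes that (nW) allows at most n intervening words, so that (W) is the case n = 0.

```python
def within(words, first, second, n=0):
    """True if `second` follows `first` with at most n words in between."""
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(0 < j - i <= n + 1 for i in first_positions for j in second_positions)


title = "SOLAR THERMAL AND WIND ENERGY".split()
print(within("SOLAR ENERGY".split(), "SOLAR", "ENERGY"))  # True, adjacent
print(within(title, "SOLAR", "ENERGY", n=3))              # True, 3 words between
print(within(title, "SOLAR", "ENERGY", n=2))              # False
print(within(title, "ENERGY", "SOLAR", n=3))              # False, wrong order
```

Every occurrence of the first word has to be compared against every occurrence of the second, which gives some feel for why adjacency searching on frequent words is slow.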

Full text operators used with truncation or stemming

A recent addition to DIALOG is the ability to use full field
features in conjunction with stemmed or truncation features.
Examples: S LIBRAR??? ? (F) AUTOMAT???? ?; S WOM?N (F) SOCIETY
Simulator difference: The simulator cannot use internal
truncation when adjacency is used. E.g. S WOM?N (W) HISTORY will
not work. Note that simulator will give an unimplemented DIALOG

Range searching

Simulator difference: The simulator does not recognize range
searching requests.



Using Boolean terms

Apostrophes (') may be used to select a term with a Boolean
operator.
Example: S 'ARMY AND NAVY'

Simulator difference: The simulator works slightly more
generally than DIALOG. The difference will not be apparent in
normal use. However, DIALOG improperly handles:

S CAN'T AND WON'T

while the simulator handles it correctly.

Command entry and output features

Stacking

Use a semicolon (;) to separate a series of commands to be
executed with one carriage return.

Example: S E1-E3;S AU=BOB;L 2/MAJ; C 1 AND 3

BREAK

Use the break key to stop output and stop execution of present
command.

Example: T 1/5/1-400 [BREAK]

Simulator difference: Unfortunately this doesn't work till the
DEC clears its output buffer of approximately 150 characters.
<cntl O> will stop output immediately. Note that <cntl O> does
not stop execution and it is important to hit [break] as well.
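
Stacking amounts to splitting one input line into separate commands at each semicolon. A minimal Python sketch follows; the function name `split_stacked` is hypothetical, and the sketch does not try to protect a semicolon that might appear inside a quoted term.

```python
def split_stacked(line):
    """Split a stacked command line into its individual commands."""
    return [command.strip() for command in line.split(";") if command.strip()]


print(split_stacked("S E1-E3;S AU=BOB;L 2/MAJ; C 1 AND 3"))
# ['S E1-E3', 'S AU=BOB', 'L 2/MAJ', 'C 1 AND 3']
```

The commands are then executed in order, so a later command in the stack may refer to a set number created by an earlier one.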



Backspace and erase

Use <cntl H> or <backspace key> or <delete> to erase last
characters typed before carriage return.

Erasing a line

Use <escape> key followed by the <return> key. The system will
ignore the line and give another prompt.

Width control at logon

When giving your 8 character password a terminal width may be
specified. This can range from 30 to 115. Just follow the
password with "Wnnn" where nnn is the desired width.

Output Control

Format Options

The following options are available and may be used with the
TYPE, DISPLAY, or PRINT commands.
Format 1 - DIALOG accession number
Format 2 - Full record except abstract
Format 3 - Bibliographic citation
Format 4 - Abstract and title
Format 5 - Full record
Format 6 - Title and accession number
Format 7 - Bibliographic citation and abstract
Format 8 - Title and indexing

TYPE set #/format #/range

If no range is given, defaults to the first citation. If no
format is given, defaults to 5.

DISPLAY set #/format #/range

Same as for TYPE

PRINT set #/format #/range

Same as for TYPE
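
The defaults for the set/format/range argument can be made concrete with a short parser. The Python sketch below is an assumption-laden illustration: the function name `parse_output_spec` is hypothetical, and it handles only set numbers, not DIALOG accession numbers such as DN1023.

```python
def parse_output_spec(spec):
    """Parse 'set/format/range' into (set_no, format_no, list of record numbers)."""
    parts = spec.strip().split("/")
    set_no = int(parts[0])
    # A missing format defaults to 5 (full record).
    format_no = int(parts[1]) if len(parts) > 1 and parts[1] else 5
    records = []
    if len(parts) > 2 and parts[2]:
        for piece in parts[2].split(","):
            if "-" in piece:
                low, high = piece.split("-")
                records.extend(range(int(low), int(high) + 1))
            else:
                records.append(int(piece))
    else:
        records = [1]  # a missing range defaults to the first citation
    return set_no, format_no, records


print(parse_output_spec("10"))          # (10, 5, [1])
print(parse_output_spec("12/2/1-6"))    # (12, 2, [1, 2, 3, 4, 5, 6])
print(parse_output_spec("10/3/2-4,7"))  # (10, 3, [2, 3, 4, 7])
```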

Files

Presently there are six files in the system.

ERIC - File 1

This file consists of 8,573 citations from the ERIC database. It
contains all the RIE and CIJE documents for four clearinghouses:
IR, EA, TM, and TE from 1980. Note this was a transition year
for the ERIC thesaurus.

Suffixes: AB, TI, DE, DE*, DF, DF*, ID, ID*, IF, IF*
Prefixes: JO=, AU=, CH=, DT=
Limits: MIN, MAJ, ED, EJ

CIJE - File 2

This file consists of 10,885 citations from the ERIC database.
These are all from Current Index to Journals in Education (EJ
numbers) from four clearinghouses: IR, EA, TM, and TE from
1974-1978.

Suffixes: AB, TI, DE, DE*, DF, DF*, ID, ID*, IF, IF*
Prefixes: JO=, AU=
Limits: MIN, MAJ
DN numbers are used in place of ED or EJ numbers.

INSPEC - File 3

This file consists of 12,864 documents, which is the last 4 months
of the 1979 Computer and Control file.
Suffixes: AB, TI, DE, DF, ID, IF
Note that the ID fields are the free text terms assigned by
INSPEC indexers.
Prefixes: JO=, AU=
Limits: FRN, ENG
DN numbers are used for internal access.

OSP - File 4

This file consists of the research being conducted
presently at Syracuse University. It is produced by the Office
of Sponsored Programs under Bill Silson. It is (presumably)
being continually updated.
Suffixes: TI, AB, DE, DF
Prefixes: Sponsor's Name (SN=), Project Director (PD=),
Department Name (DN=)

LRAP - File 5

This file contains bibliographic citations for books, reports,
dissertations, and other items of importance to the Local Revenue
Administration Project, funded by U.S. Agency for International
Development through Syracuse University Maxwell School. The
project is directed by D. Glynn Cochraine.

Suffixes: TI, AB, ID, IF, DE, DF, GE, GR (Geographical region)
Prefixes: Author (AU=), Affiliation (AF=), Source (SO=), Date of
Publication (PD=), Document Type (DT=), Contract Number (CN=),
Historical Period (HP=), Call number (CA=)
Limits: ENG, FRN, MAP, BIB (Bibliography), TAB (Table)

PSYABS - File 6

This file consists of 11,662 citations from the Psychological
Abstracts database. It consists of all documents from issue 64
with a DT (document type) of journal.
Suffixes: TI, AB, DE, DF, IF
Prefixes: AU=, JO=, SH=

Simulator file limitations

Thesaurus

There are no RT entries in the main inverted file. However,
descriptors are listed with a ? in the related term column
during an EXPAND. These items can have a thesaural expansion
done by doing an E E9 (in the case where E9 has a ? in the RT
column). Also no posting information is included in the
thesaurus EXPANDs.

Other simulator features for the head honcho

EXPLAIN files

When any file is created under the main PPN (e.g. [3434,12]) or
the PPN from which the simulator is being executed with a DIA
extension, it is accessible from the simulator using an EXPLAIN
command. E.g. if a file is created called BOB.DIA then ?BOB
will type out this file on-line. If a file called LOGON.DIA is
created it is printed whenever anyone logs on.

Required passwords

When a file called PASSWD.DIA is created in the account from
which the simulator is being executed then only the passwords in
that file can use the simulator. A form of an entry in this file
is:

<8 letter password><space><file number><space><LALL command>

The file number and the LALL command are both optional. An
example entry is

WALDSTEI 1 /STEM

which will cause a person using password WALDSTEI to logon into file 1
with a LIMIT ALL to STEM.
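
The entry layout can be read mechanically. The Python sketch below parses one entry into its three parts; it is an illustration only, the function name `parse_passwd_entry` is hypothetical, and it assumes a numeric second token is the file number and anything after it is the LALL argument.

```python
def parse_passwd_entry(line):
    """Parse '<password> [file number] [LALL command]' into a dictionary."""
    tokens = line.split()
    password = tokens[0]
    rest = tokens[1:]
    file_number = None
    if rest and rest[0].isdigit():
        file_number = int(rest[0])  # optional file to log the user into
        rest = rest[1:]
    limit_all = " ".join(rest) or None  # optional LIMIT ALL argument
    return {"password": password, "file": file_number, "limit_all": limit_all}


print(parse_passwd_entry("WALDSTEI 1 /STEM"))
print(parse_passwd_entry("WALDSTEI"))
```

Treating both trailing parts as optional matches the statement that an entry may consist of the password alone.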

ERIC file size on the DEC 10

The size needed for storage of the ERIC file in blocks (128 DEC 10
words) is as follows:

ERIC.DAT - document file                                  12720 blocks
ERIC.INV - main inverted file                              5369 blocks
ERIC.JO  - journal inverted file                             79 blocks
ERIC.CH  - Clearinghouse file                                J9 blocks
ERIC.DT  - Document type file                                59 blocks
ERIC.AU  - author inverted file                             429 blocks
ERIC.BIG - inverted file of terms with >1100 postings       691 blocks
ERIC.THE - ERIC thesaurus                                  2338 blocks

CIJE file size on the DEC 10

The size needed for storage of the CIJE file in blocks (128 DEC 10
words) is as follows:

CIJE.DAT - document file                                   8467 blocks
CIJE.INV - main inverted file                              3749 blocks
CIJE.JO  - journal inverted file                            119 blocks
CIJE.AU  - author inverted file                             539 blocks
CIJE.BIG - inverted file of terms with >1100 postings       326 blocks
CIJE.THE - CIJE thesaurus                                  2066 blocks

An indeterminate amount of space can be used by the EXPLAIN
commands as described above.
13063

3-methoxy-4-hydroxyphenylglycol excretion in acutely
schizophrenic patients during a controlled clinical trial of the
isomers of flupenthixol.

Joseph, H.; Baker, H.; Johnstone, Eve C.; Crow, T. J.
Psychopharmacology. 1979 Vol 64(1) 35-40
SECTION CODE: 3340; 2520

ABSTRACT: Urinary 3-methoxy-4-hydroxyphenylglycol (MHPG) excretion in 45
acute schizophrenics was studied before and during a trial of
the isomers of flupenthixol and placebo. Pretrial MHPG excretion
was not related to severity of illness before the trial or to
other pretrial clinical variables. In male Ss, higher pretrial
MHPG excretion was associated with a better outcome 1 yr
posttrial. However, in females, no relationship between MHPG
excretion and outcome was established. During the trial there
was a reduction in MHPG excretion in Ss treated with beta-
flupenthixol but no decrease in the group treated with alpha-
flupenthixol or chlorpromazine. In Ss on placebo, there was a
reduction in MHPG excretion in those who did well clinically but
not in those who did poorly. Thus low MHPG excretion may be a
predictor of poor outcome in schizophrenia, but MHPG excretion
also changes as a function of clinical state and neuroleptic
drug administration. (35 ref)
DESCRIPTORS: URINATION; NOREPINEPHRINE; METABOLITES; ACUTE SCHIZOPHRENIA;
NEUROLEPTIC DRUGS; HUMAN SEX DIFFERENCES; DRUG THERAPY;
NEUROCHEMISTRY; PREDICTION
IDENTIFIERS: isomers of flupentixol, urinary excretion of 3-methoxy-4-
hydroxyphenylglycol & relationship of metabolite levels to
clinical variables & prediction of drug response, male vs female
acute schizophrenics
13029

Treatment of severe dog phobia in childhood by flooding: A case
report.

Sreenivasan, Uma; Manocha, S. N.; Jain, V. K.

Journal of Child Psychology & Psychiatry & Allied Disciplines.
1979 Jul Vol 20(3) 255-260
SECTION CODE: 3330

ABSTRACT: An 11-yr-old girl with a 5-yr history of severe phobia of dogs
was treated with flooding after desensitization failed. 19 mo
after flooding the S was free of the phobia and symptoms of a
tension state. (10 ref)
DESCRIPTORS: IMPLOSIVE THERAPY; PHOBIAS; SCHOOL AGE CHILDREN; CASE REPORT;
HUMAN FEMALES

IDENTIFIERS: flooding treatment, treatment of dog phobia, 11 yr old female
Utilized consumer-descriptive and behavioral-descriptive data to
examine the factors that influence overall magazine readership
levels within a sample of US men and women (2,819 women and
3,186 men). Over 70% of the total variance in readership could be
predicted with a combination of demographic, psychographic,
media-usage, TV-program-choice, and magazine-choice variables.
Psychographic dimensions were more important predictors for
women than men, and TV-program factors were more important for
men than women. These patterns may develop from the
(generalized) differences in the uses of media for each sex, or
from sexually based differences in how individuals perceive the
gratifications available from this different media. Further
research would be necessary to confirm the suspicion, the author
notes, but congruity of TV and magazine preference patterns
could be expected more frequently where psychographically
related functions of the media (for "other-directedness") were
weaker. Men may perceive TV and magazines as similar media (for
relaxation, perhaps), whereas women's use of these print and
broadcast media differs and therefore their selection patterns
differ. It is also noted that the pattern of demographic and
psychographic predictors confirms previous findings on the
positive relationship between higher socioeconomic
characteristics and higher magazine readership. (US ref)

35 patients (mean age 34.7 yrs) with premenstrual syndrome
recorded their symptoms daily using the Moos Menstrual Distress
Questionnaire. These were analyzed by a least mean square method
of fitting sine waves. After recording an untreated cycle, Ss
were given progesterone (200 mg) and placebo in a double-blind
crossover manner; 75% of the Ss were then given progesterone
(400 mg) and placebo in a similar manner. Treated cycles were
rated by both daily menstrual distress questionnaires and
retrospective self-assessment. Both rating methods showed there
was no significant difference between progesterone and placebo
in reducing symptoms of premenstrual syndrome, and in the
majority of cases placebo was more effective, although never
significantly so. (13 ref)

In a replication of a study by H. Garland and K. H. Price (see
PA, Vol 61:1020), 143 male and 83 female advanced university
business students read descriptions of the success or failure of
a fictional female manager in the 1st yr of her job, completed
the Women as Managers Scale (WAMS), and rated 4 possible causes
for the manager's success or failure (ability, effort, luck, or
nature of job). Garland and Price's finding that WAMS scores
were not affected by success or failure descriptions was
replicated for both male and female Ss, and additional data show
that males and females tended to attribute success and failure
to similar factors. (10 ref)
TITLE: Psychophysiological investigations of post lunch state in male
and female subjects.
AUTHOR: Christie, Margaret J.; McBrearty, Eileen M.

DESCRIPTORS: FOOD INTAKE; HUMAN BIOLOGICAL RHYTHMS; METABOLISM; EMOTIONAL
STATES; PSYCHOPHYSIOLOGY; PERFORMANCE; PARASYMPATHETIC NERVOUS
SYSTEM; HUMAN SEX DIFFERENCES; BODY TEMPERATURE

IDENTIFIERS: lunch, diurnal variation in blood glucose & autonomic factors &
body temperature & mood & performance efficiency, male vs female
Ss, implications for parasympathetic involvement in deactivated
mood

TITLE: A developmental attributional analysis of sex role stereotypes
for sport performance.
AUTHOR: Bird, Anne M.; Williams, Jean M.

DESCRIPTORS: SCHOOL AGE CHILDREN; ADOLESCENTS; AGE DIFFERENCES; STEREOTYPED
ATTITUDES; SEX ROLE ATTITUDES; SPORTS; ATTRIBUTION

IDENTIFIERS: age & sex of athlete & outcome & sport, attributions of ability
vs luck to sports performances & sex role stereotypes, male &
female 7-9 vs 10-12 vs 13-15 vs 16-18 yr old Ss

TITLE: Human social attitudes affected by androstenol.

AUTHOR: Kirk-Smith, Michael; Booth, D. A.; Carroll, D.; Davies, P.

DESCRIPTORS: HUMAN SEX DIFFERENCES; SOCIAL PERCEPTION; EMOTIONAL RESPONSES;
EMOTIONAL STATES; ANDROGENS; DRUG EFFECTS
IDENTIFIERS: androstenol, mood & personality ratings of people in photographs,
male vs female Ss

TITLE: Adults' conceptions of children's cognitive abilities.
AUTHOR: Miller, Scott A.; White, Nancy; Delgado, Maria

DESCRIPTORS: COGNITIVE ABILITY; COGNITIVE DEVELOPMENT; HUMAN SEX DIFFERENCES;
PIAGETIAN TASKS; ADULTS; DEVELOPMENTAL DIFFERENCES

IDENTIFIERS: various Piagetian cognitive ability tasks & type of question
asked of adults, adult conceptions of children's abilities, male
vs female & parent vs nonparent Ss

TITLE: Performance-self-esteem and dominance behavior in mixed-sex
dyads.

AUTHOR: Stake, Jayne E.; Stake, Michael N.

DESCRIPTORS: HUMAN SEX DIFFERENCES; SELF ESTEEM; PERFORMANCE; SEX ROLES;
DOMINANCE; GROUP DECISION MAKING; DYADS; INTERPERSONAL
INFLUENCES

IDENTIFIERS: decision making dominance in mixed sex dyads & performance self
esteem, male & female Ss

TITLE: Crowding, contagion, and laughter.
AUTHOR: Freedman, Jonathan L.; Perlick, Deborah

DESCRIPTORS: CROWDING; LAUGHTER; INTERPERSONAL INFLUENCES; GROUP DYNAMICS
IDENTIFIERS: low vs high density crowding conditions & confederate laughing
vs not laughing during humorous tapes, amount of laughter by Ss,
female college students
TITLE: Severity of psychiatric disorder and the 30-item General Health
Questionnaire.
AUTHOR: Finlay-Jones, Robert A.; Murphy, Elaine

DESCRIPTORS: TEST VALIDITY; QUESTIONNAIRES; MENTAL HEALTH; MENTAL DISORDERS;
PSYCHODIAGNOSIS

IDENTIFIERS: validity of 30-item General Health Questionnaire, diagnosis of
severity of psychiatric disorder, 18-40 yr old female general
practice patients vs yr old Ss with recent severe physical
symptoms

TITLE: Consequences for targets of aggression as a function of
aggressor and instigator roles: Three experiments.
AUTHOR: Gaebelein, Jacquelyn W.; Mander, Anthony

DESCRIPTORS: ROLES; AGGRESSIVE BEHAVIOR; ROLE PERCEPTION; ROLE EXPECTATIONS
IDENTIFIERS: aggressor vs instigator role of Ss, intensity of aggression
toward opponent, female college students

TITLE: Aggression against a remorseful wrongdoer: The effects of self-
blame and concern for the victim.
AUTHOR: Harrell, W. Andrew

DESCRIPTORS: GUILT; THEFT; CRIMINALS; AGGRESSIVE BEHAVIOR

IDENTIFIERS: remorseful vs nonremorseful thief, aggressive behavior towards
thief, female Ss

TITLE: Interpersonal gaze and helping behavior.

AUTHOR: Valentine, Mary E.; Ehrlichman, Howard

DESCRIPTORS: EYE CONTACT; HUMAN SEX DIFFERENCES; ALTRUISM; ASSISTANCE (SOCIAL
BEHAVIOR)

IDENTIFIERS: eye contact, helping behavior, male vs female Ss

TITLE: Importance of imagery in maintenance of feedback-assisted
relaxation over extinction trials.
AUTHOR: LeBoeuf, Alan; Wilson, Clare

DESCRIPTORS: IMAGERY; BIOFEEDBACK TRAINING; RELAXATION THERAPY; EXTINCTION
(LEARNING)

IDENTIFIERS: use of imagery vs passive concentration during frontalis EMG
feedback training, maintenance of relaxation during extinction
trials, female Ss

TITLE: Subjective estimates of body tilt and the rod-and-frame test.

AUTHOR: Sigman, Eric; Goodenough, Donald R.; Flannagan, Michael

DESCRIPTORS: ROD AND FRAME TEST; ILLUSIONS (PERCEPTION); ESTIMATION; VISUAL
PERCEPTION

IDENTIFIERS: magnitude estimation procedure, illusory self tilt effect in rod
Searcher ______________________        Query Number ______________________

Date Search Collected _________        Order of Representations __________

Date Search to be returned ____        DIALOG Password ___________________

Date Returned to                       Date Returned
Brian McLaughlin ______________        to NSF ____________________________
Some Important Points:

1. Each new search must be started by the full BEGIN command.

2. Be sure to print the documents retrieved before typing the next
BEGIN command.

3. If no documents are retrieved, type NOTHING FOUND and print
using Format 1, any one document.

4. You do not need to LOGOFF after each search before starting the
next search.



TO LOGON AND LOGOFF:

The step-by-step sequence for connecting with the computer, for
conducting a DIALOG search, and for disconnecting from the computer,
is given below.

1. If you are using a dial-up terminal, the phone number is 423-1313.

2. Turn power on and hit carriage return.

3. Type: LOG 3434,14

4. Type: NSF

5. DO DIALOG

The computer will ask for your DIALOG password. It is given at
the top of this page.

6. Type: BEGIN

The computer will ask for the query number and will lock the
search to a particular representation code.
7. Carry out the search for this query.

Remember, we want a high recall search. Refer to the DIATOM-
DIALOG Simulator handout for a description of possible commands.

Before starting a new search, use the PRINT command (the format
should be 1) to have a set of the retrieved documents printed.
If no documents have been retrieved, type in NOTHING FOUND and
print out any 1 document with FORMAT 1.

8. If you want to conduct another search (for the same query)
begin at Step 6.

If you are completely done searching for now, go to Step 9.

9. Type: LOGOFF

10. Type: K/F

11. Return all the materials to Brian McLaughlin.

HELP AND ASSISTANCE:



Brian McLaughlin 
210 Hubbell Avenue 
Syracuse, New York 



476-7359 (Home) 
423-2091 (Work) 



NSF Retrieval Project 
113 Euclid Avenue 
Syracuse, New York 



423-4549 (Room 304) 
or 

(Room 306) 
NSF INFORMATION RETRIEVAL PROJECT

INSTRUCTIONS TO PARTICIPANTS

Attached you will find a copy of your interest statement and
two copies of a list of references. Copy (a) is to be used as
part of the study and should be returned after you make your
judgements of relevance. Copy (b) is yours to keep.

Each citation is organized into seven parts:

DN - Document identification number
TI - Title
AU - Author
SO - Source of the citation (i.e. journal title)
AB - Abstract
DT - Date
DE - Descriptors of the citation

Please read each citation and abstract to form an idea of what
that particular document (book, article, report) is about. Compare
this to your interest statement, and for each citation listed,
decide how closely that citation is related to your topic. Based
on the information in front of you, is the citation relevant to
your topic, or not relevant to what you had in mind?

Use the following scale for your judgement:

1 - Definitely relevant to your topic.

2 - Probably relevant to your topic.

3 - Probably not relevant to your topic.

4 - Definitely not relevant to your topic.

Please rate each citation by placing the number corresponding
to your judgement in the box immediately following each citation.
After you have checked all the citations to see whether or not
they are relevant to your interest statement, please return the copy
with the judgements to us in the pre-addressed envelope through
campus mail. If you are not on campus, these envelopes should be
used to return the completed forms to us through the regular mail
service. Thank you for your cooperation.

If you have any questions, please contact us at

School of Information Studies
Syracuse University
113 Euclid Avenue
Syracuse, New York 13210
423-4549

6/16/80



Phase II

Appendix B-2 Page 74

NSF INFORMATION RETRIEVAL PROJECT
INSTRUCTIONS TO PARTICIPANTS (A)

Attached you will find a copy of your interest statement and
two copies of a list of references. Copy (A) is to be used as
part of the study and should be returned after you make your
judgements of relevance. Copy (B) is yours to keep.

Each citation is organized into eight parts:

Document identification number
Title
Author
Source of the citation
Section Code
Abstract
Descriptors of the citation
Identifiers

Please read each citation and abstract to form an idea of what
that particular document is about. Compare this to your interest
statement, and for each citation listed, decide how closely that
citation is related to your topic. Based on the information in
front of you, is the citation relevant to your topic, or not
relevant to what you had in mind?

Use the following scale for your judgement:

1 - Definitely relevant to your topic.

2 - Probably relevant to your topic.

3 - Probably not relevant to your topic.

4 - Definitely not relevant to your topic.

Please rate each citation by placing the number corresponding
to your judgement in the box immediately following each citation.

After you have checked all citations to see whether or not they
are relevant to your interest statement, please return the copy
with the judgements to us in the pre-addressed envelope through
campus mail. If you are not on campus, these envelopes should
be used to return the completed forms to us through the regular
mail service. Thank you for your cooperation.

If you have any questions, please contact us at

School of Information Studies
Syracuse University
113 Euclid Avenue
Syracuse, New York 13210
423-4549
SCHOOL OF INFORMATION STUDIES

Phase I

113 EUCLID AVENUE SYRACUSE, NEW YORK 13210 PHONE (315) 423-2911

NSF INFORMATION RETRIEVAL PROJECT

We are working on a project which will help us understand
how the pertinence of information retrieved by computer
is related to the method by which it is searched.

For this project, we need information requests which will
be searched in Computer and Computer Control Abstracts (from
October 1979 to January 1980). If you need information in
the area of computers and information science, we will
conduct a search for you free of charge. All you have to
do is submit a search request to us and give us information
on how we did after the search.

For the search request we would like you to describe a
topic of interest to you; one you are working on or are
familiar with, in the computer field. Several days later
you will receive a list of citations that have been retrieved
by computer. You will be asked at that time to indicate
which of these are pertinent to your interest. One copy of
the computer output will be returned to us and the other copy
will be for your own use.

We would very much appreciate your cooperation and
participation in this project. If you are willing to
participate, please read the attached pages and write your
search request in the space provided.

If you do not need a search, please pass this form to
a student.

7/24/80
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Phase I 



113 EUCLID AVENUE   SYRACUSE, NEW YORK 13210   PHONE (315) 423-2911

NSF INFORMATION RETRIEVAL PROJECT

As a participant in this project we would like you to submit
a search request (on the attached form) about some aspect of
computers and information science.

We will take your request and search the current issues of
COMPUTER AND COMPUTER CONTROL ABSTRACTS. The results of this
search will be a list of citations to books and journal articles.

We will then give you this list of citations and ask that
you let us know which of these are most pertinent to your search
request.

The enclosed form is for you to describe your topic of
interest. If you are planning a talk, or doing a paper, you
probably have a topic in mind; if you don't have a topic you are
working on, consider one with which you are familiar. Using this
form, write down your information requirements as if you were
talking to a colleague who understands the field as well as you
do. Don't worry about trying to say it in "computerese"; we have
trained people to make sure that your search is conducted
professionally.

Thank you for your cooperation. If you have any questions
please feel free to contact us.

NSF Information Retrieval Project
School of Information Studies
113 Euclid Avenue
Syracuse, New York 13210
(315) 423-4522
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DATE. 



PHONE:

PHONE:

We would like a description of your topic of interest. This
statement should be clear enough so that any person who also knows
about this topic would, on the basis of this statement alone, be
able to pick out citations of interest for you.

Please write your description here:




















Given your purposes in requesting this search, how many citations
do you want?

About how many citations on your topic do you expect to receive
from this computer search?

YOU MAY FOLD THIS REQUEST FORM IN THIRDS, STAPLE SECURELY, AND
DROP IN CAMPUS MAIL.                                      4/4/80
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NSF INFORMATION RETRIEVAL PROJECT 



We are working on a project which will help us understand
how the pertinence of information retrieved by computer is related
to the method by which it is searched.

For this project, we need information requests which will be
searched in Psychological Abstracts (from July to December 1980).
If you need information in the area of psychology or related fields
included in Psychological Abstracts, we will conduct a search for
you free of charge. All you have to do is submit a search request
to us and give us information on how we did after the search.

For the search request, we would like you to describe a topic
of interest to you; one you are working on or are familiar with, in
the psychology field. Several days later, you will receive a list
of citations that have been retrieved by the computer. You will be
asked at that time to indicate which of these are pertinent to your
interest. One copy of the computer output will be returned to us,
and the other copy will be for your own use.

We would very much appreciate your cooperation and participation
in this project. Please read the attached pages and write your
search request in the space provided, if you are willing to
participate.

If you do not need a search, please pass this form to a
student or fellow colleague.

JULY 1981
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SCHOOL OF INFORMATION STUDIES 

113 EUCLID AVENUE   SYRACUSE, NEW YORK 13210   PHONE (315) 423-2911

Phase II

NSF INFORMATION RETRIEVAL PROJECT

As a participant in this project, we would like you to submit
a search request (on the attached form) about some aspect of
psychology or a related field.

We will take your request and search Psychological Abstracts
(July 1980 - December 1980). The result of this search will be a
list of citations to journal articles.

We will then give you this list of citations and ask that
you let us know which of these are most pertinent to your search
request.

* * * * * * * * * * * *

The attached form is for you to describe your topic of
interest. If you are planning a talk or doing a paper, you
probably have a topic in mind; if you do not have a topic you
are working on, consider one with which you are familiar. Using
this form, write down your information requirements as if you
were talking to a colleague who understands the field as well
as you do.

* * * * * * * * * * * *

Thank you for your cooperation. If you have any questions,
please feel free to contact us.

NSF Information Retrieval Project
School of Information Studies
113 Euclid Avenue
Syracuse, New York 13210
(315) 423-4549

JULY 1981
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NAME : DATE . 



SCHOOL ADDRESS:                                   PHONE:

HOME ADDRESS:                                     PHONE:

We would like a description of your topic of interest. This
statement should be clear enough so that any person who also knows
about this topic would, on the basis of this statement alone, be
able to pick out citations of interest for you.

Please write your description here:

Given your purposes in requesting this search, how many citations
do you want?

About how many citations on your topic do you expect to receive
from this computer search?

YOU MAY FOLD THIS REQUEST FORM IN THIRDS, STAPLE SECURELY, AND
DROP IN CAMPUS MAIL.

Page 1



Searcher:

Date to Searcher:

Date to be Returned:

Search Query number:

Representation Code this Query:

DIALOG Password:

Date Returned to Brian McLaughlin:

Date Returned to NSF:

Some Important Notes:

1.  Each new query to be searched must be started by the full
    BEGIN command.

2.  You do not need to LOGOFF after each query before starting the
    next query. You do need to PRINT the documents retrieved
    before typing the BEGIN command for the new query.

3.  Truncation cannot be used with the stemming representation (ST).
    It can be used with other representations.

4.  Though you can use adjacency, you should know that it may run
    very slowly. Instead, you may choose to use the field operator
    (F). This implementation of DIALOG will not allow the use of
    adjacency with truncation, or adjacency with stemming.

To LOGON and LOGOFF

The step-by-step sequence for connecting with the computer, for
conducting a DIALOG search, and for disconnecting from the computer
is given below.

Everything you type at the terminal must be sent to the computer
with a carriage return.

The computer responses to some of these commands are not given here.

1.  If you are using a dial-up terminal, the phone number is
    423-1313. Remember, it must be a hard-copy terminal.

2.  Turn power on and hit carriage return.

3.  Type: LOG 34 34,14

4.  Type: NSF

5.  Type: DO DIALOG

    The computer will ask for your dialog password. It is
    given at the top of this page.

(5/2/80)

SEARCH QUERY COVER SHEET - Page 2



6.  Type: BEGIN

    The computer will ask for the query number and the
    representation code. Both can be found at the top of
    Page 1.

7.  Carry out the search for this query.

    Remember, we want a high recall search with a maximum of
    50 documents retrieved.

    Before starting a new query you need to have the set of
    retrieved documents printed. Use the PRINT command; the
    format should always be 1.

8.  If you want to search another query, look at the COVER SHEET
    for that query and begin at Step 6.

    If you are completely done searching for now, go to Step 9.

9.  Type: LOGOFF

10. Type: K/F

11. Turn power off, collect your materials and submit them to
    Brian McLaughlin.

Submitting Searches

Brian McLaughlin will distribute and collect all searches. When
a search is completed, you need to submit this COVER SHEET and a
copy of your interaction. Queries should be searched and
returned within 48 hours after receiving them.

Help and Assistance

1.  Brian McLaughlin
    210 Hubbell Avenue
    Syracuse, New York

    476-7359 (Home)
    423-2091 (Work)

2.  NSF Retrieval Project
    113 Euclid Avenue
    Syracuse, New York

    423-4522

Page 1



Searcher

Date Search Collected

Date Search to be returned

Date Returned to Brian McLaughlin

Query Number

Order of Representations

DIALOG Password

Date Returned to NSF

Some Important Points:

1.  Each new search must be started by the full BEGIN command.

2.  Be sure to print the documents retrieved before typing the next
    BEGIN command.

3.  If no documents are retrieved, type NOTHING FOUND and print,
    using Format 1, any one document.

4.  You do not need to LOGOFF after each search before starting the
    next search.

TO LOGON AND LOGOFF:

The step-by-step sequence for connecting with the computer, for
conducting a DIALOG search, and for disconnecting from the computer,
is given below.

1.  If you are using a dial-up terminal, the phone number is 423-1313.

2.  Turn power on and hit carriage return.

3.  Type: LOG 34 34, 14

4.  Type: NSF

5.  Type: DO DIALOG

    The computer will ask for your dialog password. It is given at
    the top of this page.

6.  Type: BEGIN

    The computer will ask for the query number and will lock the
    search to a particular representation code.

SEARCH QUERY SHEET - Page 2



7.  Carry out the search for this query.

    Remember, we want a high recall search. Refer to the
    DIATOM-DIALOG Simulator handout for a description of possible
    commands.

    Before starting a new search, use the PRINT command, the format
    should be 1, to have a set of the retrieved documents printed.

    If no documents have been retrieved, type in NOTHING FOUND and
    print out any 1 document with FORMAT 1.

8.  If you want to conduct another search (for the same query)
    begin at Step 5.

    If you are completely done searching for now, go to Step 9.

9.  Type: LOGOFF

10. Type: K/F

11. Return all the materials to Brian McLaughlin.

HELP AND ASSISTANCE:

1.  Brian McLaughlin
    210 Hubbell Avenue
    Syracuse, New York

    423-2091 (Work)

2.  NSF Retrieval Project
    113 Euclid Avenue
    Syracuse, New York

    423-4549 (Room 304)
       or
             (Room 306)
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I X 


TA 


)" 


DI 


TT 


DD 


MINO 


ST 


TT 


DD 




TA 


DI 


AA 


SETT 


TT 


TA 


ST 


DI 


AA 


DD 


I I 


LA^UB 


TA 


AA 


TT 


DD 


ii 


ST 


DI 


MCUA 


I I 


.DI 


AA 


TT 


DD 


TA 


ST 


ABBO 


DI 


DD 


I I 


TA 


ST 


AA 


TT 






SQUARE 4



V 



c 




1 22 


123 


124 


1 '^S 


126 


127 


128 


lEDWA 


TA 


ST * 


11^ 


DI 


AA 


IiJCi 




VAUG 


DD" 


1 I 


TT ' 


DI 


T^ 


ST 


AA- 


& 


MINO 




AA 


ST 


I I 


TT 


DD 


TA 




SETT 


AA 


TT 


r>^ 


TA 


DD 


I I 


ST 




LAUB 


I I 


I DD 


DD 


AA 


ST 


DI 


TT 


d 


MCUA 


TT 


AA 


ST 


I I 


TA 


DI 




ABBO; 


ST 


DI 


TA 


DD 


AA 


TT 


II 
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) 



Phase I

Query   129  130  131  132  133  134  135

EDWA    DI   II   TA   DD   AA   TT   ST
VAUG    TT   ST   DI   TA   DD   II   AA
MINO    II   AA   TT   DI   TA   ST   DD
SETT    ST   DD   II   TT   DI   AA   TA
LAUB    TA   TT   DD   AA   ST   DI   II
MCLA    DD   DI   AA   ST   II   TA   TT
ABBO    AA   TA   ST   II   TT   DD   DI

SQUARE 6

Query   136  137  138  139  140  141  142

EDWA    TT   TA   ST   DI   II   AA   DD
VAUG    ST   TT   DD   II   AA   TA   DI
MINO    AA   II   TA   ST   DD   DI   TT
SETT    TA   AA   TT   DD   DI   II   ST
LAUB    DI   DD   II   TA   TT   ST   AA
MCLA    DD   ST   DI   AA   TA   TT   II
ABBO    II   DI   AA   TT   ST   DD   TA

SQUARE 7

Query   143  144  145  146  147  148  149

EDWA    TA   TT   ST   II   DI   AA   DD
VAUG    DD   DI   II   TT   TA   ST   AA
MINO    DI   II   AA   ST   TT   DD   TA
SETT    AA   TA   TT   DI   DD   II   ST
LAUB    II   AA   TA   DD   ST   DI   TT
MCLA    ST   DD   DI   TA   AA   TT   II
ABBO    TT   ST   DD   AA   II   TA   DI

SQUARE 8

Query   150  151  152  153  154  155  156

EDWA    II   TT   DD   AA   TA   DI   ST
VAUG    DD   AA   TT   DI   II   ST   TA
MINO    TA   DD   II   TT   ST   AA   DI
SETT    ST   II   TA   DD   DI   TT   AA
LAUB    DI   TA   ST   II   AA   DD   TT
MCLA    AA   ST   DI   TA   TT   II   DD
ABBO    TT   DI   AA   ST   DD   TA   II
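
Each square assigns every representation code once to each searcher (row) and once to each query (column). The following is a minimal sketch of that check, using the Square 8 assignments. The function name `is_latin_square` is illustrative and not part of the study, and the first LAUB entry, which is not legible in the source, is assumed to be DI because it is the only code missing from that row and that column.

```python
from typing import Dict, List, Set

# The seven representation codes used in the Phase I squares.
CODES: Set[str] = {"TT", "TA", "ST", "DI", "II", "AA", "DD"}

# Rows are searchers, columns are queries 150-156 (Square 8).
SQUARE_8: Dict[str, List[str]] = {
    "EDWA": ["II", "TT", "DD", "AA", "TA", "DI", "ST"],
    "VAUG": ["DD", "AA", "TT", "DI", "II", "ST", "TA"],
    "MINO": ["TA", "DD", "II", "TT", "ST", "AA", "DI"],
    "SETT": ["ST", "II", "TA", "DD", "DI", "TT", "AA"],
    # First entry assumed (illegible in the source listing).
    "LAUB": ["DI", "TA", "ST", "II", "AA", "DD", "TT"],
    "MCLA": ["AA", "ST", "DI", "TA", "TT", "II", "DD"],
    "ABBO": ["TT", "DI", "AA", "ST", "DD", "TA", "II"],
}


def is_latin_square(square: Dict[str, List[str]], codes: Set[str]) -> bool:
    """Return True if every code appears exactly once per row and per column."""
    rows = list(square.values())
    if len(rows) != len(codes):
        return False
    # Each row must be a permutation of the codes.
    if any(len(row) != len(codes) or set(row) != codes for row in rows):
        return False
    # Each column must also contain every code exactly once.
    return all(set(column) == codes for column in zip(*rows))


print(is_latin_square(SQUARE_8, CODES))
```

The same check can be run on any of the other squares by substituting its rows.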





Query   157  158  159  160  161  162  163

EDWA    AA   ST   II   DI   TA   TT   DD
VAUG    TT   DI   TA   AA   ST   DD   II
MINO    ST   II   TT   TA   DD   DI   AA
SETT    II   TT   DI   DD   AA   TA   ST
LAUB    DD   AA   ST   TT   DI   II   TA
MCLA    DI   TA   DD   ST   II   AA   TT
ABBO    TA   DD   AA   II   TT   ST   DI

Query   164  165  166  167  168  169  170

EDWA    AA   TT   DI   ST   DD   II   TA
VAUG    DI   AA   ST   TA   TT   DD   II
MINO    TT   DD   AA   DI   II   TA   ST
SETT    ST   DI   TA   II   AA   TT   DD
LAUB    DD   II   TT   AA   TA   ST   DI
MCLA    TA   ST   II   DD   DI   AA   TT
ABBO    II   TA   DD   TT   ST   DI   AA

SQUARE 11

Query   171  172  173  174  175  176  177

EDWA    TT   ST   DI   II   AA   TA   DD
VAUG    ST   DD   II   AA   TA   TT   DI
MINO    II   AA   TT   ST   DD   DI   TA
SETT    AA   TA   ST   DD   DI   II   TT
LAUB    DD   DI   AA   TA   TT   ST   II
MCLA    TA   TT   DD   DI   II   AA   ST
ABBO    DI   II   TA   TT   ST   DD   AA

SQUARE 12

Query   178  179  180  181  182  183  184

EDWA    AA   TT   TA   DI   DD   II   ST
VAUG    DI   AA   II   ST   TT   DD   TA
MINO    TT   DD   ST   AA   II   TA   DI
SETT    DD   II   DI   TT   TA   ST   AA
LAUB    II   TA   AA   DD   ST   DI   TT
MCLA    ST   DI   DD   TA   AA   TT   II
ABBO    TA   ST   TT   II   DI   AA   DD
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o 
Query   185  186  187  188  189  190  191

EDWA    TA   II   TT   AA   ST   DI   DD
VAUG    DD   TT   DI   ST   II   TA   AA
MINO    AA   DI   TA   II   TT   DD   ST
SETT    ST   TA   DD   TT   DI   AA   II
LAUB    II   DD   AA   DI   TA   ST   TT
MCLA    DI   ST   II   DD   AA   TT   TA
ABBO    TT   AA   ST   TA   DD   II   DI

Query   192  193  194  195  196  197  198

EDWA    TT   DD   AA   DI   ST   TA   II
VAUG    DD   II   TT   AA   DI   ST   TA
MINO    DI   AA   ST   TA   II   DD   TT
SETT    II   TA   DD   TT   AA   DI   ST
LAUB    AA   TT   DI   ST   TA   II   DD
MCLA    ST   DI   TA   II   DD   TT   AA
ABBO    TA   ST   II   DD   TT   AA   DI
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Random Query Order ^ . 




201 LAUB D I T A
201 MCLA D I T A
201 MINO A T D I
201 STOR A D I T

202 LAUB A I T D
202 MCLA D A I T
202 MINO D A I T
202 STOR I T A D

203 LAUB A I T D
203 MCLA T D I A
203 MINO D I A T
203 STOR T D I A

204 LAUB I D T A
204 MCLA T D A I
204 MINO A T I D
204 STOR D I A T

205 LAUB D I A T
205 MCLA I A D T
205 MINO A D T I
205 STOR D T I A

206 LAUB D T I A
206 MCLA I T D A
206 MINO A T D I
206 STOR A T I D

' 207fNLAUB B A T I 

207 MCLA I B T A 
207 MINO I A B *T 

207 STOR T B A I 

208 LAUB B A T I 
208 MCLA B A" T I 
208 MINO BAIT 

208 STOR T A I D 

209 LAUB B A T I 
209 MCLA' B I T A 
209 MINO A T B I 

209 STOR T B A I 

4 - • 

210 LAUB B I T A 
^ 210 MCLA I A^ B T- 

210 MINO B T I A 
210 STOR T I A B 



211 LAUB T I A n 

211 MCLA A B T I 

211 MINQ T A B I 

211 STOR I B T A 

212 LAUB I ff T A 
21? MCLA T B I A 
212 MINO B A .1 T 
212 STO?J A I B T 



213 A I B T 

213 MCLA I T A B 

213 MINO B A T I 

213 STOR A B T I 



214' LAUB B 'A T I 

214 MCLA A B T I 

214 MINO T A B I 

214 STOR T A I B 



^ 215 LAUB A I T B 

215 MCLA A I. B. T 

215 MXNO-A I B T 

215 STOR I T A B 



216 LAUB I B T A 

216 MCLA A B T I 

216 MINO A.B.I T 

216 STOR B A T I 



217 LAUB A T B I 

217 MCLA ABIT 

217 MINd I T A ,B 

217 STOR ABIT 

18 LAUB B I A J 

218 MCLA T I B A' 
218 , MINO T A B I 

218 STOR AIT J\ 

J 

219 LAUB I T A B 
219 HCLA I II A T 
219 MINO B I T A 
219 STOR B T A I 



220 LAUB B I T A 

220 MCLA B I AT 

220 MINO A I T .,B 
220^ STOR T I A B 

221 LAUB T B- A I 
221 MCLA BAIT 
221 MINO IT A B 

221 STOR A I B T 

222 LAUB I A El T 
222 MCLA T B A I 
222 MINO I B A T 

222 STOR T A B I 

223 Li^UB B T I A 
223 MCLA I T -A B 
223 MINO A T B I 
223 STOR ABIT 



224 LAUB I T B A 
224 MCLA T B I A 
224 MINO B I A T 
224 STOR I A B T 



225 LAUB A T I B 

225 MCLA .A B T I 

225 MINO I T A B 

225 STOR A T B I 



226 LAUB D A T I 

226 MCLA A B I-T 

226 MINO T I A B 

226 STOR B I T A 



, 227 LAUB T A I B 

227 MCLA B I T A 

227 MINO A I T B 

227 STOR T B I A 



228 LAUB T B A I 

228 MCLA I B A T 

228 MINO B I A T 

228 STOR A T I B 



0. 
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229 l-AHD n I r A 

229 MCI. A D A I T 

229 MINO A T n I 

229 STOR I A n T 

230 LAUB A. n I T 
230 MCLA A I T n 
230 MI NO I AD T 

230 sruR A T I n 

231 LAUB T I n A 
231 MCLA D T A I 
231 'MIND T I n A 

231 GTOR T I A n 

232 LAUB T A D I * 
232 MCLA I n T A 
232 Mlf^O I T A n 

232 STOR I T A n 

233 LAUB A n T I 
233 MCLA A T B I 
233 MINO D A T I 

233 STOR T D I A 

234 LAUB I T D A 
234 MCLA T I D A 
234 MINO A D T I 

234 STOR T I D A 

235 LAUB A D. I T 
235 ^ffcLA D T I A 
235 MINO D A I T 

235 STOR A I T D 

236 LAUB D T A I 
236 MCLA I T D A 
236 MINO T D A I 

236 STOR A T I D 

237 LAUB T D A I 
237 MCLA A T D I 
237 MINO T D I A j 
237 STOR A T I D 



233 LAUB I A D T 

238 MCLA I D T A 

238 MI^^^O T D I A I 

233 STOR I D T A 



239 LAUB D A I T 

239 MCLA T -A I D 

239 MINO I D T A 

239 STOR I D T A 



240 LAUB D T I A 

240 MCLA A. T I D 

240 MINO A I T D ' 

240 S^TOR T D A I 

241 LAUB D T A I 
241 MCLA D I T A 

?24l MINO T D A I^ 
241 STOR A T I d' 



24:< LAUB T I A D 

242 MCLA D I T A 

242 MINO r D A I 

242 STOR A D T I 

243 LAUB D A T I 
243 MCLA D A I "T 
243 MINO A T I D 

243 STOR/ ADIT 

244 LAUB T A I D 

244 MCLA D T A I 
2-44 MINO A T I D 

' 244 STOR TAD I , 

245 LAUB I T D A 
245 MCLA T A I D 
245 MINO T I D A 

245 STOR T I A D 

246 LAUB B T I A 
246 MCLA I T A D 
246 MINO A D T I 

246 STOR T A I D 

247 LAUB A T I D 
247 MCLA T I A D 
247 MINO D -I A T 
247 STOR A, T D. I 



248 LAUB D I A T 

248 MCLA I A T 

243 MINCr-T- I- A D-- 

248 STOR D A T I \ 



249 LAUD T D A I 

249 MCLA A I T D 

249 MINO A T q I 

249 STOR .,A T D I 

250 LAUB T D A I • 
250 MCLA T I B A 
250 MINO D I T A 

250 STOR I T A. D 

251 LAUB I T A D 
251 MCLA D I A T 
251 MINO I A T D 

251 STOR n T I A 

252 LAUB T A D I 
252 MCLA D T A I 
252 MINO I .A D T 

252 STOR I D A T 

253 LAUB A T D I 
253 MCLA A T I D 
253 MINO A I T D 

253 STOR Ei A I T 

254 I..AUB I A T D 
254 MCLA I T A D 
254 MINO T A D I 

254 STOR, T A I D 

255 LAUB D I T A 
255 MCLA A T I D 
255 MINO A I D T 
255 STOR I T D A 



256 LAUB D I A T 

256 MCLA T I B A 

256 MINO T I A D 

256 STOR D A T I 

257 LAUB D A T I 
25i7 MCLA A T I D 
257 MINO I D A T 
257 STOR D^T A I 
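
The listing above gives, for each query and searcher, an order in which the four representation codes were to be searched. How the original orders were generated is not stated; the sketch below shows one way such a listing could be produced, assuming an independent random permutation per query and searcher. The function name, the seed, and the use of Python's `random` module are illustrative assumptions.

```python
import random
from typing import Dict, Iterable, List, Tuple

# Code letters and searcher labels as they appear in the listing.
REPRESENTATIONS: List[str] = ["A", "D", "I", "T"]
SEARCHERS: List[str] = ["LAUB", "MCLA", "MINO", "STOR"]


def random_query_orders(
    queries: Iterable[int],
    searchers: List[str],
    codes: List[str],
    seed: int,
) -> Dict[Tuple[int, str], List[str]]:
    """Draw one random permutation of the codes for every (query, searcher)."""
    rng = random.Random(seed)  # seeded so the listing can be regenerated
    orders: Dict[Tuple[int, str], List[str]] = {}
    for query in queries:
        for searcher in searchers:
            # sample() with k == len(codes) returns a shuffled copy.
            orders[(query, searcher)] = rng.sample(codes, k=len(codes))
    return orders


orders = random_query_orders(range(201, 207), SEARCHERS, REPRESENTATIONS, seed=1981)
for (query, searcher), order in sorted(orders.items()):
    print(query, searcher, " ".join(order))
```

Unlike the Phase I squares, independent permutations do not guarantee that each representation is searched first equally often within a query.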
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Phase .1 



AOV SUMMARY TABLE: Recall-1

                              Sum of             Mean
Source                        Squares     df     Square      F

Between Squares                 2.624     11      .239
Queries in Squares             10.415     58      .180
Searchers                       4.072      6      .679
Squares X Searcher              7.940     66      .120
Representations                 1.415      6      .236     3.324*
Squares X Representation        6.021     66      .091     1.282**
Residual                       19.714    276      .071
  (by subtraction)

Total                          52.201    489

*Region of rejection begins at 2.14 (α = .05) or 2.89 (α = .01)

**Region of rejection begins at 1.12 (α = .25). Since the obtained
value falls within the region of rejection, the square X
representation source of variation is not pooled into the
residual.

NOTE 1: Tukey's HSD region of rejection = 4.17
        standard error = .0318

NOTE 2: Missing values in the data (14 queries retrieved no
        highly relevant documents) required a least squares
        solution to the analysis. This approach exceeded
        the limits of the computer. Approximation methods
        were then employed.
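
The mean squares and F ratio in the Recall-1 table follow from the sums of squares and degrees of freedom. A minimal sketch of that arithmetic, using the printed values, is given below; the function names are illustrative. It assumes, as the printed figures suggest, that the F for Representations was formed from mean squares rounded to three places.

```python
from typing import Dict, Tuple

# (sum of squares, degrees of freedom) as printed in the Recall-1 table.
RECALL_1: Dict[str, Tuple[float, int]] = {
    "Between Squares": (2.624, 11),
    "Queries in Squares": (10.415, 58),
    "Searchers": (4.072, 6),
    "Squares X Searcher": (7.940, 66),
    "Representations": (1.415, 6),
    "Squares X Representation": (6.021, 66),
    "Residual": (19.714, 276),
}


def mean_square(sum_of_squares: float, df: int) -> float:
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F = effect mean square divided by error mean square."""
    return ms_effect / ms_error


ms_rep = mean_square(*RECALL_1["Representations"])
ms_res = mean_square(*RECALL_1["Residual"])

# F from mean squares rounded to three places, as they are printed.
f_printed = f_ratio(round(ms_rep, 3), round(ms_res, 3))
# F from the unrounded mean squares, for comparison.
f_unrounded = f_ratio(ms_rep, ms_res)

# The component sums of squares and df should add up to the Total row.
total_ss = sum(ss for ss, _ in RECALL_1.values())
total_df = sum(df for _, df in RECALL_1.values())

print(round(ms_rep, 3), round(ms_res, 3))
print(round(f_printed, 3), round(f_unrounded, 2))
print(round(total_ss, 3), total_df)
```

Either way the ratio exceeds the .05 critical value of 2.14 quoted in the table footnote.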



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, \/ ' Appendix F 

Phase I

AOV SUMMARY TABLE: Recall-2

                         Sum of             Mean
Source                   Squares     df     Square      F

Squares                     .963     11      .088
Queries in Squares         5.678     65      .087
Searchers                  4.088      6      .681
Squares X Searchers        4.842     66      .073
Representations            1.032      6      .172     3.44*
Pooled Error              19.038    384      .050
  (by subtraction)

Total                     35.641    538

*Region of rejection begins at 2.14 (α = .05) or 2.89 (α = .01)

NOTE 1: Tukey's HSD region of rejection = 4.17
        standard error = .0255

NOTE 2: Missing values in the data (7 queries retrieved no
        relevant documents at all) required a least squares
        solution to the analysis. This approach exceeded
        the limits of the computer. Approximation methods
        were then employed.

AOV SUMMARY TABLE: Precision-1



                         Sum of             Mean
Sources                  Squares     df     Square      F

Squares                    3.536     11      .321
Queries in Squares*       15.066     72      .209
Searchers                  0.528      6      .088
Squares X Searchers        3.740     66      .057
Representations            0.219      6      .0365    .829 (n.s.)
Pooled Error              15.829    360      .044
  (by subtraction)

Total                     38.918    521



*Missing values in the data (66 cases with no documents retrieved)
required a least squares solution to the analysis. This
approach exceeded the limits of the computer. Approximation
methods were then employed which resulted in more than one
value for the Queries in Squares sum of squares. The value
given above is the smaller of the two values, which led to a
slightly larger value for the Error sum of squares. The
approach is conservative in the sense that if the effect of
representations were to be significant, it would also be
significant if the other value for the Queries in Squares sum
of squares were used.
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Phase I 



AOV SUMMARY TABLE: Precision-2

                         Sum of             Mean
Sources                  Squares     df     Square      F

Squares                    5.489     11      .499
Queries in Squares*       19.886     72      .276
Searchers                  0.691      6      .115
Squares X Searchers        5.348     66      .081
Representation             0.364      6      .0607    1.05 (n.s.)
Pooled Error              20.788    360      .0577
  (by subtraction)

Total                     52.566    521

*Missing values in the data (66 cases with no documents retrieved)
required a least squares solution to the analysis. This
approach exceeded the limits of the computer. Approximation
methods were then employed which resulted in more than one
value for the Queries in Squares sum of squares. The value
given above is the smaller of the two values, which led to a
slightly larger value for the Error sum of squares. The
approach is conservative in the sense that if the effect of
representations were to be significant, it would also be
significant if the other value for the Queries in Squares
sum of squares were used.

AOV SUMMARY TABLE: Total-Retrieved

                         Sum of                  Mean
Sources                  Squares        df       Square        F

Between Squares         10688.347       11       971.668
Queries in Squares      40273.878       72       559.359
Searchers               19316.177        6      3219.363
Squares X Searchers     13719.415       66       207.870
Representations          3654.511        6       609.085     4.24*
Residual                61236.183      426       143.747

Total                  148888.511      587

*Region of rejection begins at 2.14 (α = .05) or 2.89 (α = .01)

NOTE: Tukey's HSD region of rejection = 4.17
      standard error = 1.308
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iphase II 


AOV SUMMARY TABLE: Recall-1

                              Sum of             Mean
Source                        Squares     df     Square      F

Searcher                        0.652      3     0.217     3.91**
Representation                  0.868      3     0.289     5.20**
Searcher X Representation       0.101      9     0.011     0.20
Within Cell                    38.535    693     0.056



*attached to an F statistic indicates that the probability of
 obtaining that value by chance alone is less than 5%.

**attached to an F statistic indicates that the probability of
  obtaining that value by chance alone is less than 1%.

NOTE 1: Analysis of variance of the Phase II data was
        preceded by a multivariate test of all five
        dependent variables. Any observation that was
        "missing" on one or more of these variables was
        automatically eliminated for all five of the
        variables. Consequently, the degrees of freedom
        for the Analysis of Variance Summary tables are
        based on the remaining observations. The Tables
        of Means (Tables 6 and 8), however, are based on
        the number of observations remaining after the
        missing values were eliminated from that variable
        only.
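
NOTE 1 distinguishes two ways of handling missing values: dropping an observation from all five variables if it is missing on any one of them (used for the summary tables), and dropping it only from the variable on which it is missing (used for the Tables of Means). A minimal sketch of the difference follows; the variable keys and the three observations are hypothetical and serve only to illustrate the counting.

```python
from typing import Dict, List, Optional

# The five dependent variables; key names are illustrative.
VARIABLES: List[str] = [
    "recall_1", "recall_2", "precision_1", "precision_2", "total_retrieved",
]

Observation = Dict[str, Optional[float]]


def listwise_complete(observations: List[Observation]) -> List[Observation]:
    """Keep only observations with a value on every one of the five variables."""
    return [o for o in observations if all(o[v] is not None for v in VARIABLES)]


def available_for(observations: List[Observation], variable: str) -> List[Observation]:
    """Keep observations that have a value on the one variable of interest."""
    return [o for o in observations if o[variable] is not None]


# Hypothetical observations (None marks a missing value).
observations: List[Observation] = [
    {"recall_1": 0.50, "recall_2": 0.40, "precision_1": 0.20,
     "precision_2": 0.30, "total_retrieved": 25.0},
    {"recall_1": None, "recall_2": 0.10, "precision_1": 0.00,
     "precision_2": 0.05, "total_retrieved": 40.0},
    {"recall_1": None, "recall_2": None, "precision_1": None,
     "precision_2": None, "total_retrieved": 0.0},
]

print(len(listwise_complete(observations)))                 # basis for the AOV tables
print(len(available_for(observations, "recall_2")))         # basis for a table of means
print(len(available_for(observations, "total_retrieved")))
```

This is why the degrees of freedom in the summary tables can be smaller than the counts behind the corresponding means.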
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Appendix G 



Phase II

AOV SUMMARY TABLE: Recall-2

                              Sum of             Mean
Source                        Squares     df     Square      F

Searcher                        0.628      3     0.209     6.92**
Representation                  0.778      3     0.259     8.57**
Searcher X Representation       0.153      9     0.017     0.56
Within Cell                    20.952    693     0.030



*attached to an F statistic indicates that the probability of
 obtaining that value by chance alone is less than 5%.

**attached to an F statistic indicates that the probability of
  obtaining that value by chance alone is less than 1%.

NOTE 1: Analysis of variance of the Phase II data was preceded
        by a multivariate test of all five dependent variables.
        Any observation that was "missing" on one or more of
        these variables was automatically eliminated for all
        five of the variables. Consequently, the degrees of
        freedom for the Analysis of Variance Summary tables
        are based on the remaining observations. The Tables
        of Means (Tables 6 and 8), however, are based on the
        number of observations remaining after the missing
        values were eliminated from that variable only.

NOTE 2: Using Tukey's HSD procedure for the PsychAbs data
        base results, the region of rejection (α = .05)
        begins at 3.63. The standard error and the minimal
        difference that would be significant between any two
        representation means are 0.013 and 0.047.
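
The figures in NOTE 2 are consistent with the usual Tukey HSD relation, in which the minimal significant difference is the critical value multiplied by the standard error. A minimal sketch follows; the function names are illustrative, and the pair of means in the usage lines is hypothetical rather than taken from the Tables of Means.

```python
def minimal_significant_difference(q_critical: float, standard_error: float) -> float:
    """Tukey HSD: smallest difference between two means that is significant."""
    return q_critical * standard_error


def differs(mean_a: float, mean_b: float, hsd: float) -> bool:
    """True if two means differ by at least the minimal significant difference."""
    return abs(mean_a - mean_b) >= hsd


# Values quoted in NOTE 2 for Recall-2.
hsd = minimal_significant_difference(3.63, 0.013)
print(round(hsd, 3))  # 0.047

# Hypothetical representation means, for illustration only.
print(differs(0.30, 0.36, hsd))
print(differs(0.30, 0.33, hsd))
```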
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V 



Phase II

AOV SUMMARY TABLE: Precision-1

                              Sum of             Mean
Source                        Squares     df     Square      F

Searcher                        0.216      3     0.072     0.86
Representation                  0.417      3     0.139     1.66
Searcher X Representation       0.198      9     0.022     0.26
Within Cell                    58.128    693     0.084



*attached to an F statistic indicates that the probability of
 obtaining that value by chance alone is less than 5%.

**attached to an F statistic indicates that the probability of
  obtaining that value by chance alone is less than 1%.

NOTE 1: Analysis of variance of the Phase II data was preceded
        by a multivariate test of all five dependent variables.
        Any observation that was "missing" on one or more of
        these variables was automatically eliminated for all
        five of the variables. Consequently, the degrees of
        freedom for the Analysis of Variance Summary tables
        are based on the remaining observations. The Tables
        of Means (Tables 6 and 8), however, are based on the
        number of observations remaining after the missing
        values were eliminated from that variable only.
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1 

AOV SUMMARY TABLE: Precision-2

                              Sum of             Mean
Source                        Squares     df     Square      F

Searcher                        0.337      3     0.112     1.19
Representation                  1.670      3     0.557     5.91**
Searcher X Representation       0.289      9     0.032     0.34
Within Cell                    65.250    693     0.094


♦attached to an F statistic indicates that the probability of 
obtaining that value by chance alone is less than 5%. 

**attached to an F statistic indicates that the probability of 
obtaining that value % chance alone is less than 1%. 



NOTE 1: Analysis of variance of the Phase II data was preceded
        by a multivariate test of all five dependent variables.
        Any observation that was "missing" on one or more of
        these variables was automatically eliminated for all
        five of the variables. Consequently, the degrees of
        freedom for the Analysis of Variance Summary tables
        are based on the remaining observations. The Tables
        of Means (Tables 6 and 8), however, are based on the
        number of observations remaining after the missing
        values were eliminated from that variable only.



NOTE 2: Using Tukey's HSD procedure for the PsychAbs data
        base results, the region of rejection (α = .05)
        begins at 3.63. The standard error and the minimal
        difference that would be significant between any two
        representation means are 0.023 and 0.084.
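
The two figures in NOTE 2 can be reproduced approximately with the sketch below. It rests on assumptions that the note does not state: that the standard error is sqrt(within-cell mean square / n), that the four representations have equal numbers of observations, and that the total number of observations is the total df plus one (3 + 3 + 9 + 693 + 1 = 709). The names are illustrative only.

```python
from math import sqrt


def hsd_standard_error(ms_within, n_per_group):
    """Standard error of a group mean as used in Tukey's HSD (equal n assumed)."""
    return sqrt(ms_within / n_per_group)


def hsd_minimal_difference(q_critical, standard_error):
    """Smallest difference between two means that reaches significance."""
    return q_critical * standard_error


ms_within = 0.094            # within-cell mean square from the table above
total_observations = 709     # assumption: total df (708) + 1
n_per_group = total_observations / 4   # assumption: four equal-sized groups

se = hsd_standard_error(ms_within, n_per_group)
min_diff = hsd_minimal_difference(3.63, se)

print(f"standard error     = {se:.3f}")
print(f"minimal difference = {min_diff:.3f}")
```

Under these assumptions the sketch gives values that round to the 0.023 and 0.084 quoted in the note. With unequal group sizes a different effective n would be needed, so the agreement should be read as a plausibility check rather than a reconstruction of the original computation.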
Phase II

AOV SUMMARY TABLE: Total-Retrieved

Source                       Sum of Squares     df    Mean Square      F

Searcher                        6379.012         3      2126.337      9.54**
Representation                  8673.786         3      2891.262     12.98**
Searcher X Representation       4463.48          9       495.942
Within Cell                   154393.334       693       222.790




*attached to an F statistic indicates that the probability of
obtaining that value by chance alone is less than 5%.

**attached to an F statistic indicates that the probability of
obtaining that value by chance alone is less than 1%.



NOTE 1: Analysis of variance of the Phase II data was preceded
        by a multivariate test of all five dependent variables.
        Any observation that was "missing" on one or more of
        these variables was automatically eliminated for all
        five of the variables. Consequently, the degrees of
        freedom for the Analysis of Variance Summary tables
        are based on the remaining observations. The Tables
        of Means (Tables 6 and 8), however, are based on the
        number of observations remaining after the missing
        values were eliminated from that variable only.



NOTE 2: Using Tukey's HSD procedure for the PsychAbs data
        base results, the region of rejection (α = .05)
        begins at 3.63. The standard error and the minimal
        difference that would be significant between any two
        representation means are 0.023 and 0.084.
Appendix H

1. Proof that $r_{1 \text{ or } 2 \text{ or } \ldots \text{ or } n}$ is a product of the $r_i$'s.

Let d be a relevant retrieved document, $R_i$ is the i-th
representation and $r_i$ is the recall achieved by that
representation. Then,

\begin{align*}
r_{1 \text{ or } 2 \text{ or } \ldots \text{ or } n}
  &= \mathrm{Prob}(d \text{ is retrieved by at least one of the } R_i) \\
  &= 1 - \mathrm{Prob}(d \text{ is not retrieved by any of the } R_i) \\
  &= 1 - \prod_{i=1}^{n} \mathrm{Prob}(d \text{ is not retrieved by } R_i) \qquad (*) \\
  &= 1 - \prod_{i=1}^{n} \bigl(1 - \mathrm{Prob}(d \text{ is retrieved by } R_i)\bigr) \\
  &= 1 - \prod_{i=1}^{n} (1 - r_i)
\end{align*}


*NOTE: This step depends upon the independence assumption.
       See section VII-C of this report.
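
The result of this proof, combined recall = 1 - product of (1 - r_i), can be illustrated with a minimal sketch. The recall values below are hypothetical and chosen only for illustration; the function name is likewise an assumption, and the formula holds only under the independence assumption noted above.

```python
from math import prod


def combined_recall(recalls):
    """Recall of the union of several representations, assuming that each
    representation retrieves a given relevant document independently."""
    return 1.0 - prod(1.0 - r for r in recalls)


# Hypothetical recall values for three representations (not from the report).
recalls = [0.2, 0.3, 0.5]

# Cumulative recall as representations are added one at a time.
cumulative = [combined_recall(recalls[:k]) for k in range(1, len(recalls) + 1)]

for k, value in enumerate(cumulative, start=1):
    print(f"first {k} representation(s): predicted recall = {value:.2f}")
```

Computing the value for successive prefixes of the list gives the incremental improvement that random (independent) retrieval would predict as representations are added sequentially.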

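2. Proof that asymmetric overlap equals $r_i$ under the
independence assumption.

For $R_1$ and $R_2$,

\begin{align*}
\text{overlap}_{12}
  &= \frac{r_1 + r_2 - r_{1 \text{ or } 2}}{r_2} \\
  &= \frac{r_1 + r_2 - 1 + (1 - r_1)(1 - r_2)}{r_2} \\
  &= r_1
\end{align*}

NOTE: $r_{1 \text{ or } 2}$ = recall obtained by relevant documents retrieved
by either $R_1$ or $R_2$.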
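
A small numerical sketch of this identity follows. It assumes that the asymmetric overlap of R_1 with respect to R_2 is the recall achieved by both representations divided by the recall of R_2, and that the recall of "either" follows the independence formula from part 1. The function name and the recall values are hypothetical.

```python
def asymmetric_overlap(r1, r2):
    """Overlap of representation 1 relative to representation 2 under the
    independence assumption: (recall of both) / (recall of representation 2)."""
    r_either = 1.0 - (1.0 - r1) * (1.0 - r2)   # recall of R1 or R2
    r_both = r1 + r2 - r_either                # inclusion-exclusion
    return r_both / r2


# Hypothetical recall values (not from the report).
r1, r2 = 0.3, 0.5

print(f"overlap of R1 relative to R2 = {asymmetric_overlap(r1, r2):.3f}")
print(f"overlap of R2 relative to R1 = {asymmetric_overlap(r2, r1):.3f}")
```

Under independence the overlap relative to R_2 reduces to r_1, and the overlap relative to R_1 reduces to r_2, which is why the two directions of overlap generally differ.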



